Page MenuHomeFreeBSD

Fix another race between vm_map_protect() and vm_map_wire().
ClosedPublic

Authored by kib on Apr 28 2019, 3:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 20 2023, 12:18 AM
Unknown Object (File)
Jun 28 2023, 9:23 PM
Unknown Object (File)
Jun 26 2023, 11:10 PM
Unknown Object (File)
Jun 15 2023, 7:13 PM
Unknown Object (File)
May 6 2023, 9:49 PM
Subscribers

Details

Summary

vm_map_wire() increments entry->wire_count, after that it drops the map lock both for faulting in the entry' pages, and for marking next entry in the requested region as IN_TRANSITION. Only after all entries are faulted in, MAP_ENTRY_USER_WIRE flag is set.

This makes it possible for vm_map_protect() to run while other entry MAP_ENTRY_IN_TRANSITION flag is handled, and vm_map_busy() lock does not prevent it. In particular, if the call to vm_map_protect() adds VM_PROT_WRITE to CoW entry, it would fail to call vm_fault_copy_entry(). There are at least two consequences of the race: the top object in the shadow chain is not populated with writeable pages, and second, the entry eventually get contradictory flags MAP_ENTRY_NEEDS_COPY | MAP_ENTRY_USER_WIRED with VM_PROT_WRITE set.

Handle it by waiting for all MAP_ENTRY_IN_TRANSITION flags to go away in vm_map_protect(), which does not drop map lock afterwards. Note that vm_map_busy_wait() is left as is.

Reported by: pho

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/vm/vm_map.c
2402 ↗(On Diff #56780)

You could find the first "in_transition" entry in the "first pass" loop above, and then, if the "first_pass" loop completed, and if an "in_transition" entry had been found, mark it for wakeup and try again. That would save you a list traversal.

I am missing something. Why isn't the solution to busy the map when dropping the map lock to handle IN_TRANSITION? Otherwise, doesn't vmspace_fork() still have a similar problem?

I am missing something. Why isn't the solution to busy the map when dropping the map lock to handle IN_TRANSITION? Otherwise, doesn't vmspace_fork() still have a similar problem?

In fact not. If you look closely, you would note that:

  • vmspace_fork() (actually vm_map_copy_entry()) tests for entry->wired_count > 0 to call vm_fault_copy_entry()
  • vm_map_protect() tests for MAP_ENTRY_USER_WIRED for the call

And then vm_fault_copy_entry() does very different things for cases, despite the overall structure is close enough to keep it in same function.

vm_map_wire() increments entry->wire_count before marking the map as busy and dropping vm_map lock to call vm_fault() on pages, but clears IN_TRANSITION and set MAP_ENTRY_USER_WIRED only after all entries are faulted in. If your argument is that vm_map_protect() should check wired_count > 0, I belive that this is semantically incorrect. If you mean that vmspace_fork() should ensure that all IN_TRANSITION flags should drain, then it might be but I do not see it as necessary, because non-busy map and wired_count > 0 do guarantee that all pages are already faulted into the top object.

sys/vm/vm_map.c
2402 ↗(On Diff #56780)

Yes, I do not want to sleep for IN_TRANSITION if some entry does not satisfy the protection check. I will implement your suggestion in the next revision of the patch.

Eliminate added loop as suggested by dougm.

sys/vm/vm_map.c
2359 ↗(On Diff #56821)

As an avoider of gotos, I suggest that you consider a do-while loop here instead.

2393 ↗(On Diff #56821)

I expect that you mean this to be "!= 0"; otherwise you have changed the meaning of the patch substantially since the previous iteration.

2396 ↗(On Diff #56821)

In the previous iteration, entry->eflags was not marked for an in-transition entry if the function returned somewher in the first loop. Now, with this version, an in-transition entry will be marked for wakeup, even if a later entry leads to early return.

I had expected you to save the first found entry and mark it for wakeup only if there was such an entry found, after the loop. Do you intend this behavior change? Do you want to mark only one entry for wakeup, or all of them?

kib marked 2 inline comments as done.Apr 29 2019, 7:53 PM
kib added inline comments.
sys/vm/vm_map.c
2359 ↗(On Diff #56821)

This would increase the indent level for very unlikely code path. I do not see anything wrong with goto.

2393 ↗(On Diff #56821)

Thank you, this is stupid thinko.

2396 ↗(On Diff #56821)

At worst it would cause unused wakeup() call, but I see you point. I re-wrote it exactly as you want.

kib marked 2 inline comments as done.

Fix flag check, the condition was reverted.
Avoid spurious wakeup.

This now wakes up the last in-transition entry instead of the first one, but I expect that it doesn't matter. I've got nothing else to offer here.

This revision is now accepted and ready to land.Apr 29 2019, 8:07 PM
This revision was automatically updated to reflect the committed changes.
head/sys/vm/vm_map.c
2392

Should 'current' replace 'entry' here?

head/sys/vm/vm_map.c
2392

Yes. I don't see how repeatedly checking the same entry 'entry' on every iteration is correct.