If the first page is not fully valid, or the attempt to page in
subsequent pages fails, we currently free the pages. However, the pages
may have already been resident and wired by mlock(). This can happen if
execve() attempts to activate an image which has been mlock()ed and
truncated.
Details
- Reviewers
alc kib - Commits
- rS352748: Fix handling of invalid pages in exec_map_first_page().
syzkaller found the bug and generated a reproducer. I verified that this
patch fixes the crash triggered by that reproducer.
Diff Detail
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 26658 Build 25032: arc lint + arc unit
Event Timeline
What is the point of deactivating them ? Inactive scan frees invalid pages upon encountering them.
As opposed to doing nothing? vm_page_deactivate_noreuse() ensures that the pages skip LRU, so the page daemon (which handles races with concurrent wirings) will free them sooner.
Indeed, in the first block we could simply free the page if vm_page_unwire_noq() returns true.
I thought the second block was trickier, but now I see it is sufficient to simply check vm_page_wired().
sys/kern/kern_exec.c | ||
---|---|---|
1012 | Obviously this should be ma[0], I did not commit the typo fix before uploading. |
My review description is confusing. A concurrent truncation is not needed to trigger the problem. The key is that an mlock() of a read-only mapping of a file forces the backing pages to stay resident after a truncation. A subsequent execve() of the file will trigger the assertion, since the pages will be invalid and wired.
Here is a program that reproduces the panic: https://reviews.freebsd.org/P319
The program that syzkaller generated uses fewer system calls but is multi-threaded.
Yes, and I think that a similar effect can be achieved with msync(MS_INVALIDATE). BTW, we can disable MS_INVALIDATE on text mappings now.
That said, what is the use of msync(MS_INVALIDATE) except for writing test cases ?
Probably, though note that msync(MS_INVALIDATE) fails if part of the region is user-wired.
Are you suggesting to check vn_writechk() in vm_object_sync()?
That said, what is the use of msync(MS_INVALIDATE) except for writing test cases ?
I'm not sure. I wonder if it made more sense before the VM and buffer cache unification. Maybe it is still useful for some types of device mappings where userspace must manage coherence.