Page MenuHomeFreeBSD

vm_object: Assert that managed pages are on pagequeues when freeing
ClosedPublic

Authored by markj on Oct 4 2024, 3:08 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Nov 14, 1:42 AM
Unknown Object (File)
Thu, Nov 14, 12:05 AM
Unknown Object (File)
Thu, Nov 7, 8:30 PM
Unknown Object (File)
Thu, Nov 7, 9:41 AM
Unknown Object (File)
Oct 15 2024, 6:55 AM
Unknown Object (File)
Oct 10 2024, 4:59 AM
Unknown Object (File)
Oct 6 2024, 10:40 PM
Unknown Object (File)
Oct 6 2024, 10:51 AM
Subscribers

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj requested review of this revision.Oct 4 2024, 3:08 PM
This revision is now accepted and ready to land.Oct 4 2024, 6:17 PM

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

The situation where invalid page was left on the object queue was quite common until recently.

In D46945#1070436, @kib wrote:
In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

The situation where invalid page was left on the object queue was quite common until recently.

Sure, but even in that case the page must still be added to a page queue, otherwise it is leaked until the VM object is destroyed. The assertion is meant to catch exactly this kind of bug.

In D46945#1070436, @kib wrote:
In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

The situation where invalid page was left on the object queue was quite common until recently.

Sure, but even in that case the page must still be added to a page queue, otherwise it is leaked until the VM object is destroyed. The assertion is meant to catch exactly this kind of bug.

Yes and yes. But it happens (happen), and the assert is too late to catch the moment.

I do not object against this change, only pointing out my experience.

In D46945#1070648, @kib wrote:
In D46945#1070436, @kib wrote:
In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

The situation where invalid page was left on the object queue was quite common until recently.

Sure, but even in that case the page must still be added to a page queue, otherwise it is leaked until the VM object is destroyed. The assertion is meant to catch exactly this kind of bug.

Yes and yes. But it happens (happen), and the assert is too late to catch the moment.

I do not object against this change, only pointing out my experience.

I'm less enthusiastic about it. I'm concerned that people, who are not us, will incorrectly conclude that this assertion should hold everywhere.

In D46945#1070681, @alc wrote:
In D46945#1070648, @kib wrote:
In D46945#1070436, @kib wrote:
In D46945#1070223, @kib wrote:

Consider vm_page_grab_valid(). It allocates a page, and tries to read the page' content from pager. If the pager failed, the page is freed. I do not believe that the freed invalid page is enqueued there.

But there the page is freed immediately, not during object teardown, where the assertion is added.

The situation where invalid page was left on the object queue was quite common until recently.

Sure, but even in that case the page must still be added to a page queue, otherwise it is leaked until the VM object is destroyed. The assertion is meant to catch exactly this kind of bug.

Yes and yes. But it happens (happen), and the assert is too late to catch the moment.

I do not object against this change, only pointing out my experience.

I'm less enthusiastic about it. I'm concerned that people, who are not us, will incorrectly conclude that this assertion should hold everywhere.

Would a comment help here? I'd really like to have a check of this kind: I've had at least one (probably two) bugs which can leak pages when unwiring them, and the problem only becomes apparent once a significant number of pages disappear, which can be hard to notice. This assertion doesn't catch problems at the source, true, but 1) in the case of the bugs I have in mind, that is impossible, 2) having information about the VM object involved in a leak (which may be spread among many objects and so not apparent in vmstat -o output) is a useful clue.