Page MenuHomeFreeBSD

vm_fault_hold: use vm_page_replace instead of vm_page_rename
ClosedPublic

Authored by rlibby on Dec 9 2015, 8:38 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 26, 4:52 AM
Unknown Object (File)
Sat, Jan 14, 2:24 PM
Unknown Object (File)
Sat, Jan 14, 2:22 PM
Unknown Object (File)
Sat, Jan 14, 2:21 PM
Unknown Object (File)
Sat, Jan 14, 2:21 PM
Unknown Object (File)
Sat, Jan 14, 2:21 PM
Unknown Object (File)
Sat, Jan 14, 2:21 PM
Unknown Object (File)
Sat, Jan 14, 2:16 PM
Subscribers

Details

Summary

Using vm_page_replace instead of vm_page_rename leads to slightly
cheaper object manipulation (remove, replace vs remove, remove, insert)
and avoids the possibility of the RADIX NODE allocation failing, which
could trigger retry of the fault.

Follow up from D4326 and D4471.

May depend on fortification of vm_page_replace.

Test Plan

kyua test -k /usr/tests/sys/Kyuafile

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rlibby retitled this revision from to vm_fault_hold: use vm_page_replace instead of vm_page_rename.
rlibby updated this object.
rlibby edited the test plan for this revision. (Show Details)
kib edited edge metadata.
kib added inline comments.
sys/vm/vm_fault.c
847 ↗(On Diff #11001)

I would prefer to see this done with KASSERT. E.g. you could unconditionally store mold from vm_page_replace() into a local var, and operate on the local below where fs.first_m is used in the current code (to pacify compilers which warn about assigned but not used variables, in !INVARIANTS case). KASSERT would be mret == fs.first_m.

This revision is now accepted and ready to land.Dec 10 2015, 10:06 AM
sys/vm/vm_fault.c
847 ↗(On Diff #11001)

Okay, I can do something like that.

Here's another possibility. Currently existing callers of vm_page_replace are doing the same thing, they all know the actual page they are trying to replace and assert (panic) on it post-hoc. Again for what it's worth, this includes Isilon's internal callers.

Therefore it seems to me that we could also either change the interface or provide a wrapper that essentially does

void
vm_page_replace_XXX(vm_page_t mold, vm_page_t mnew)
{
	vm_page_t mtmp;

	mtmp = vm_page_replace(mnew, mold->object, mold->pindex);
	/* XXX KASSERT/panic/whatever */
}

Another possibility would be to have the interface take all four args (object, pindex, mold, mnew) so that it could provide the strongest assertions.

Thoughts?

sys/vm/vm_fault.c
847 ↗(On Diff #11001)

Inline vm_page_replace_XXX seems to be best for me. It would allow compiler to place only a call to vm_page_replace() in case of !INVARIANTS, and simplifies expression at the callee side.

sys/vm/vm_fault.c
847 ↗(On Diff #11001)

Okay. Personally I think I would most like just to change the interface outright, but I will defer to your and alc's judgement here.

For the wrapper, how does this look (somewhere in vm_page.h):

static inline void
vm_page_replace_check(vm_page_t mnew, vm_object_t object, vm_pindex_t pindex,
    vm_page_t mold)
{
	vm_page_t mret;

	mret = vm_page_replace(mnew, object, pindex);
	KASSERT(mret == mold,
	    ("invalid page replacement, mold=%p, mret=%p", mold, mret));

	/* Unused if !INVARIANTS. */
	(void)mold;
	(void)mret;
}

I was thinking to prefer the four-argument version to the two-argument one because I think the assertion mostly protects against mold not having had the right identity, rather than the radix code not working.

If the cast-to-void pattern is not acceptable then I'd suggest INVARIANTS-dependent alternate macro definitions instead.

sys/vm/vm_fault.c
847 ↗(On Diff #11001)

If changing the vm_page_replace() signature, mold is unused for !INVARIANTS case, but it must be passed.

What you proposed above looks fine to me, thanks.

rlibby edited edge metadata.

vm_fault_hold: kib feedback: use KASSERTing version of vm_page_replace

Updated to use D4497.

This revision now requires review to proceed.Dec 11 2015, 5:06 PM
kib edited edge metadata.
kib added inline comments.
sys/vm/vm_fault.c
839 ↗(On Diff #11136)

_G_rab the page ...

849 ↗(On Diff #11136)

Comment is absolutely trivial, if you want to left it in code, please use the proper sentence.

This revision is now accepted and ready to land.Dec 11 2015, 5:58 PM
rlibby edited edge metadata.

vm_fault_hold: kib feedback: fix up comments

This revision now requires review to proceed.Dec 11 2015, 6:16 PM
rlibby added inline comments.
sys/vm/vm_fault.c
849 ↗(On Diff #11136)

Nope, I agree.

kib edited edge metadata.
This revision is now accepted and ready to land.Dec 11 2015, 6:35 PM

If we count vm_fault as a caller, then all three callers to vm_page_replace{,_checked} acquire the page lock on "mold" shortly (or even immediately) after vm_page_replace{,_checked} returns. I think that we should extend the coverage of the page lock on "mold" to cover the call to vm_page_replace{,_checked}. So, this use case would look like:

vm_page_lock(fs.first_m);
vm_page_replace_checked(fs.m, fs.first_object,
    fs.first_pindex, fs.first_m);
vm_page_free(fs.first_m);
vm_page_unlock(fs.first_m);
vm_page_dirty(fs.m);

At the same time, I would add the equivalent assertion to the beginning of vm_page_replace() as you find at the beginning of vm_page_remove() on "mold":

if ((mold->oflags & VPO_UNMANAGED) == 0)
        vm_page_assert_locked(mold);

This would address the safety question about clearing "mold->object" in vm_page_replace().

A similar observation could be made about vm_page_free, i.e., that we always call it after replace, and so it could be the last step in vm_page_replace. However, based upon the use of vm_page_replace in D4444, I would argue against calling vm_page_free from within vm_page_replace.

Personally, I would have eliminated the "object" and "pindex" parameters to vm_page_replace_checked. I generally look at such redundancy as just offering callers another way to screw up the call. However, I don't feel strongly enough to argue this point.

In D4478#95608, @alc wrote:

If we count vm_fault as a caller, then all three callers to vm_page_replace{,_checked} acquire the page lock on "mold" shortly (or even immediately) after vm_page_replace{,_checked} returns. I think that we should extend the coverage of the page lock on "mold" to cover the call to vm_page_replace{,_checked}. So, this use case would look like:

vm_page_lock(fs.first_m);
vm_page_replace_checked(fs.m, fs.first_object,
    fs.first_pindex, fs.first_m);
vm_page_free(fs.first_m);
vm_page_unlock(fs.first_m);
vm_page_dirty(fs.m);

At the same time, I would add the equivalent assertion to the beginning of vm_page_replace() as you find at the beginning of vm_page_remove() on "mold":

if ((mold->oflags & VPO_UNMANAGED) == 0)
        vm_page_assert_locked(mold);

This would address the safety question about clearing "mold->object" in vm_page_replace().

I agree with your suggestion of having the same page locking semantics as vm_page_remove. I think that vm_page_replace shoul basically behave like an optimized remove and insert.

A similar observation could be made about vm_page_free, i.e., that we always call it after replace, and so it could be the last step in vm_page_replace. However, based upon the use of vm_page_replace in D4444, I would argue against calling vm_page_free from within vm_page_replace.

I agree, and for what it's worth Isilon's internal callers also do not want the old page freed.

Personally, I would have eliminated the "object" and "pindex" parameters to vm_page_replace_checked. I generally look at such redundancy as just offering callers another way to screw up the call. However, I don't feel strongly enough to argue this point.

Yeah, I like that better as an interface too but it doesn't provide as strong of a check about the identity of the old page. Maybe another option would be to have two functions

#define vm_page_assert_name(m, object, pindex) \
	KASSERT(m->object == object, m->pindex == pindex, \
	    ("%s: page %p not in object %p at %ju",
	    __func__, m, object, (uintmax_t)pindex))
void vm_page_replace(vm_page_t mold, vm_page_t mnew);

and then callers would go back to both calling replace and asserting.

vm_page_assert_name(fs.first_m, fs.first_object, fs.first_pindex);
vm_page_replace(fs.first_m, fs.m);

Anyway that leaves two things:

  • The xbusy state of the old page. In vm_page_remove, this is optional. If vm_page_replace is "just" an optimized remove and insert, then it should be optional there too. I would propose to factor out the if (vm_page_xbusied(m)) block from vm_page_remove into a new procedure called vm_page_remove_xunbusy and call it from both places.
  • I don't think we should assert about the old page being on a paging queue, again like vm_page_remove does not.

I can write this up in code if you would like.

In terms of your D4444, I think this would allow you not to xbusy the page before calling vm_page_replace (just as the corresponding else branch there does not xbusy the page for the call to vm_page_remove).

In D4478#95654, @rlibby_gmail.com wrote:

Anyway that leaves two things:

  • The xbusy state of the old page. In vm_page_remove, this is optional. If vm_page_replace is "just" an optimized remove and insert, then it should be optional there too. I would propose to factor out the if (vm_page_xbusied(m)) block from vm_page_remove into a new procedure called vm_page_remove_xunbusy and call it from both places.
  • I don't think we should assert about the old page being on a paging queue, again like vm_page_remove does not.

I can write this up in code if you would like.

For illustration (this is assuming previous patches which are reviewed but not committed):
https://github.com/rlibby/freebsd/commit/e0143337eac22d1aad64f04ea8c0ec4e27de5f31

If you agree with the direction here I can submit it for review, maybe after breaking up into smaller patches.

In D4478#96391, @rlibby_gmail.com wrote:

For illustration (this is assuming previous patches which are reviewed but not committed):
https://github.com/rlibby/freebsd/commit/e0143337eac22d1aad64f04ea8c0ec4e27de5f31

If you agree with the direction here I can submit it for review, maybe after breaking up into smaller patches.

Can we postpone this until the already marked reviews are committed ? What is needed to commit them ?

In D4478#96861, @kib wrote:
In D4478#96391, @rlibby_gmail.com wrote:

For illustration (this is assuming previous patches which are reviewed but not committed):
https://github.com/rlibby/freebsd/commit/e0143337eac22d1aad64f04ea8c0ec4e27de5f31

If you agree with the direction here I can submit it for review, maybe after breaking up into smaller patches.

Can we postpone this until the already marked reviews are committed ? What is needed to commit them ?

Yeah, apologies for the jumble of patches. If there's a better way to submit a patch series I would appreciate a pointer.

D4577 and D4497 just need a committer. I will coordinate with @cem today and see if he can help.

For this one (D4478), I am just a little nervous about possible regressions, my reservations being the differences between vm_page_replace and current behavior (locking of mold, PQ_NONE assert). If we are comfortable that those are more cleanliness issues than actual problems then this patch too can proceed as is.

In D4478#97033, @rlibby_gmail.com wrote:
In D4478#96861, @kib wrote:
In D4478#96391, @rlibby_gmail.com wrote:

For illustration (this is assuming previous patches which are reviewed but not committed):
https://github.com/rlibby/freebsd/commit/e0143337eac22d1aad64f04ea8c0ec4e27de5f31

If you agree with the direction here I can submit it for review, maybe after breaking up into smaller patches.

Can we postpone this until the already marked reviews are committed ? What is needed to commit them ?

Yeah, apologies for the jumble of patches. If there's a better way to submit a patch series I would appreciate a pointer.

D4577 and D4497 just need a committer. I will coordinate with @cem today and see if he can help.

For this one (D4478), I am just a little nervous about possible regressions, my reservations being the differences between vm_page_replace and current behavior (locking of mold, PQ_NONE assert). If we are comfortable that those are more cleanliness issues than actual problems then this patch too can proceed as is.

Revise the locking in this patch as I've described above and get Conrad to commit it. The PQ_NONE assertion is not a problem in this case, i.e., vm_fault.

We can deal with the internals of vm_page_replace later.

rlibby edited edge metadata.

Revised locking per alc

This revision now requires review to proceed.Dec 23 2015, 12:38 AM
In D4478#97502, @alc wrote:
In D4478#97033, @rlibby_gmail.com wrote:

For this one (D4478), I am just a little nervous about possible regressions, my reservations being the differences between vm_page_replace and current behavior (locking of mold, PQ_NONE assert). If we are comfortable that those are more cleanliness issues than actual problems then this patch too can proceed as is.

Revise the locking in this patch as I've described above and get Conrad to commit it. The PQ_NONE assertion is not a problem in this case, i.e., vm_fault.

We can deal with the internals of vm_page_replace later.

Revised locking, as advised, just for vm_fault. It wasn't clear to me if you also wanted to see the other callers altered in this patch so I left it out for now (i.e. status quo for them).

One last thing to point out. Since we now get the (mold) page lock but do an unconditional xunbusy in vm_page_replace, the possibility now seems to exist that we will recurse on the page lock if the busy waiters bit is somehow set. Again I am not sure if this is actually possible from vm_fault, and I did not encounter it in testing. (Eventually I think we should deal with this as vm_page_remove does.)

In D4478#98946, @rlibby_gmail.com wrote:

One last thing to point out. Since we now get the (mold) page lock but do an unconditional xunbusy in vm_page_replace, the possibility now seems to exist that we will recurse on the page lock if the busy waiters bit is somehow set. Again I am not sure if this is actually possible from vm_fault, and I did not encounter it in testing. (Eventually I think we should deal with this as vm_page_remove does.)

first_m should not be yet added to the object page queue, and this prevents other threads from finding the first_m at all. As consequence, the page should not have waiters. But the point is valid and I do not think that vm_page_xunbusy() was designed to be used for the locked page. You may consider adding vm_page_unbusy_locked().

In D4478#98997, @kib wrote:
In D4478#98946, @rlibby_gmail.com wrote:

One last thing to point out. Since we now get the (mold) page lock but do an unconditional xunbusy in vm_page_replace, the possibility now seems to exist that we will recurse on the page lock if the busy waiters bit is somehow set. Again I am not sure if this is actually possible from vm_fault, and I did not encounter it in testing. (Eventually I think we should deal with this as vm_page_remove does.)

first_m should not be yet added to the object page queue, and this prevents other threads from finding the first_m at all. As consequence, the page should not have waiters. But the point is valid and I do not think that vm_page_xunbusy() was designed to be used for the locked page. You may consider adding vm_page_unbusy_locked().

Okay, good. You can see my intended solution under vm_page_remove_xunbusy at that github link above.

Do you want to proceed with this patch as-is and come back to the vm_page_replace internals? Or if you want to address the internals first then I can submit that patch on github as a diff here (but drop the vm_fault.c bits). I'm okay with however you guys want to approach this.

In D4478#99108, @rlibby_gmail.com wrote:

Okay, good. You can see my intended solution under vm_page_remove_xunbusy at that github link above.

Could you, please, provide the explicit link ?

In D4478#99388, @kib wrote:
In D4478#99108, @rlibby_gmail.com wrote:

Okay, good. You can see my intended solution under vm_page_remove_xunbusy at that github link above.

Could you, please, provide the explicit link ?

Yep: https://github.com/rlibby/freebsd/commit/e0143337eac22d1aad64f04ea8c0ec4e27de5f31

In D4478#99389, @rlibby_gmail.com wrote:

I do not like the propagation of the locked checks around. Unlocked page may be only passed to vm_page_remove() from vm_page_free_toq(), it seems. Could we e.g. ensure that the page is always locked in free_toq() ?

If not, I still suggest to have a function like vm_page_xunbusy_locked(), which does the same action as xunbusy(), but assumes that the page is already locked.

In D4478#99420, @kib wrote:
In D4478#99389, @rlibby_gmail.com wrote:

I do not like the propagation of the locked checks around. Unlocked page may be only passed to vm_page_remove() from vm_page_free_toq(), it seems. Could we e.g. ensure that the page is always locked in free_toq() ?

Unfortuantely I don't think so. It looks like vm_page_free_toq is called with unlocked pages all over the pmap code via vm_page_free{,_zero}.

I wonder if it is possible just to assert in vm_page_free_toq that unmanaged pages are not in a vm object anyway, and then not call vm_page_remove for them. I don't know enough to say.

If not, I still suggest to have a function like vm_page_xunbusy_locked(), which does the same action as xunbusy(), but assumes that the page is already locked.

If we can't enforce vm_page_remove is called with locked pages, then I think this would become

if page is locked:
    vm_page_xunbusy_locked()
else:
    vm_page_xunbusy()

and I'm not sure I see this as better. It would still have the lock check and it would lose the fast path where xunbusy works regardless of page lock status (the bit before vm_page_xunbusy_hard).

In D4478#99505, @rlibby_gmail.com wrote:
In D4478#99420, @kib wrote:
In D4478#99389, @rlibby_gmail.com wrote:

I do not like the propagation of the locked checks around. Unlocked page may be only passed to vm_page_remove() from vm_page_free_toq(), it seems. Could we e.g. ensure that the page is always locked in free_toq() ?

Unfortuantely I don't think so. It looks like vm_page_free_toq is called with unlocked pages all over the pmap code via vm_page_free{,_zero}.

By 'ensuring', I mean to check the page lock in vm_page_free_toq(), and lock if not. This is a single place to add the check, and free_toq() is already slow and lock-full, so I do not think that the additional lock, otherwise taken in the callees, makes much difference WRT speed.

I wonder if it is possible just to assert in vm_page_free_toq that unmanaged pages are not in a vm object anyway, and then not call vm_page_remove for them. I don't know enough to say.

Unmanaged pages do exist on the object queues, e.g. for OBJT_MGTMDEVICE or OBJT_PHYS pages.

If not, I still suggest to have a function like vm_page_xunbusy_locked(), which does the same action as xunbusy(), but assumes that the page is already locked.

If we can't enforce vm_page_remove is called with locked pages, then I think this would become

if page is locked:
    vm_page_xunbusy_locked()
else:
    vm_page_xunbusy()

and I'm not sure I see this as better. It would still have the lock check and it would lose the fast path where xunbusy works regardless of page lock status (the bit before vm_page_xunbusy_hard).

I'd like to commit this patch. Are there any further comments on it?

sys/vm/vm_fault.c
810–811 ↗(On Diff #11623)

I think that the comment above the "if" expression already explains what we're doing here better than this comment. The only thing that doesn't get mentioned in that comment is the need to rebusy the page.

In D4478#139086, @alc wrote:

I'd like to commit this patch. Are there any further comments on it?

No comments, please proceed.

This revision was automatically updated to reflect the committed changes.