Page MenuHomeFreeBSD

Use unlocked page lookup for inmem() to avoid object lock contention
ClosedPublic

Authored by mlaier on Sep 29 2020, 10:05 PM.

Details

Summary

Sponsored by: Dell EMC Isilon

Test Plan

Significant reduction in vm_object lock contention shown.

Seems to fit with: https://reviews.freebsd.org/D23847

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Could you please upload with extra context? Add -U99999 if you are uploading a raw diff, per https://wiki.freebsd.org/Phabricator .

More context - sorry, rusty fingers :)

sys/kern/vfs_bio.c
3613 ↗(On Diff #77661)

There is a vm_page_grab_pages_unlocked() that handles runs of multiple pages, did you consider using it?

3617 ↗(On Diff #77661)

I think the gotos should be replaced with a plain return statement.

3621 ↗(On Diff #77661)

Note that the shared busy state on its own does not block a concurrent vm_page_invalid() call on the page. I'm not sure if this matters much since in the previous implementation results were stale as soon as the object lock is dropped.

sys/kern/vfs_bio.c
3613 ↗(On Diff #77661)

I have decided against that, because it will block other users for longer - iff somebody wants an exclusive busy on the first page, while we are looking through the results. This way also preserved more of the original code flow.

3617 ↗(On Diff #77661)

I was trying to preserve as much of the original code as possible, but I can rework it.

3621 ↗(On Diff #77661)

Yeah - this interface is completely advisory. My goal was to make it as unimpactful as possible - hence the single page busy above as well...

There is an argument for changing this to to a VM_ALLOC_WIRED above and return the pages back to PQ_ACTIVE (because why would you check inmem if you're not planning to use it) ... but I didn't want to add side-effects.

This is fine.

But since completely advisory (it is used to avoid unneeded read-ahead), you can avoid sbusy as well, I believe. False-negatives should be safe.
You would remove RLOCK as in patch, and use vm_radix_lookup_unlocked. Then after the check for validity you can re-check with vm_radix_lookup_unlocked() that the page is still within object.

In fact I think that it would not give false negatives because the vnode lock is owned.

sys/kern/vfs_bio.c
3617 ↗(On Diff #77661)

The diff is already quite small, I don't think it hurts anything to simplify the function a bit.

3621 ↗(On Diff #77661)

You don't need to wire the page if it is busied. It would be reasonable to call vm_page_reference() before unbusying the page. If the page is already in PQ_ACTIVE, then the extra reference will be counted next time the page is visited. If the reference bit is observed during a PQ_INACTIVE scan, the page will either be requeued or moved to PQ_ACTIVE depending on whether it might be mapped.

What @kib suggested(I think) - also trying to do the cleanup @markj asked for.

markj added inline comments.
sys/kern/vfs_bio.c
3620 ↗(On Diff #77684)

I think this can move to the end of the loop body.

This revision is now accepted and ready to land.Sep 30 2020, 5:54 PM
kib added inline comments.
sys/kern/vfs_bio.c
3589 ↗(On Diff #77684)

Change return type to bool, since you are fixing style ?

3630 ↗(On Diff #77684)

Does it make sense to check for m != n ? You can return true to stop read-ahead as well. We cannot handle ABA there anyway.

sys/kern/vfs_bio.c
3589 ↗(On Diff #77684)

will do

3620 ↗(On Diff #77684)

will do

3630 ↗(On Diff #77684)

This also catches the n == NULL case (e.g. we just lost the page) and we certainly want to do read-ahead then.

This is probably overly paranoid, and I could be convinced to just drop the second lookup altogether.

This revision now requires review to proceed.Oct 1 2020, 7:09 PM

I had committed this prematurely and now reverted.

Building /usr/obj/usr/src/i386.i386/sys/SARAH/vfs_bio.o
--- vfs_bio.o ---
/usr/src/sys/kern/vfs_bio.c:3614:7: error: implicit declaration of function 'vm_page_lookup_unlocked' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                m = vm_page_lookup_unlocked(obj, OFF_TO_IDX(off + toff));
                    ^
/usr/src/sys/kern/vfs_bio.c:3614:7: note: did you mean 'vm_map_lookup_locked'?
/usr/src/sys/vm/vm_map.h:480:5: note: 'vm_map_lookup_locked' declared here
int vm_map_lookup_locked(vm_map_t *, vm_offset_t, vm_prot_t, vm_map_entry_t *, vm_object_t *,
    ^
/usr/src/sys/kern/vfs_bio.c:3614:5: error: incompatible integer to pointer conversion assigning to 'vm_page_t' (aka 'struct vm_page *') from 'int' [-Werror,-Wint-conversion]
                m = vm_page_lookup_unlocked(obj, OFF_TO_IDX(off + toff));
                  ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/src/sys/kern/vfs_bio.c:3624:5: error: incompatible integer to pointer conversion assigning to 'vm_page_t' (aka 'struct vm_page *') from 'int' [-Werror,-Wint-conversion]
                n = vm_page_lookup_unlocked(obj, OFF_TO_IDX(off + toff));
                  ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.
*** [vfs_bio.o] Error code 1
sys/kern/vfs_bio.c
3601 ↗(On Diff #77750)

Ravi Pokala pointed out that this changed to bool but the return values did change to true/false.

bdrewery requested changes to this revision.Oct 1 2020, 8:10 PM
This revision now requires changes to proceed.Oct 1 2020, 8:10 PM
sys/kern/vfs_bio.c
3601 ↗(On Diff #77750)

did NOT

This revision was not accepted when it landed; it landed in state Needs Revision.Oct 1 2020, 8:25 PM
This revision was automatically updated to reflect the committed changes.
sys/kern/vfs_bio.c
3614 ↗(On Diff #77750)

(To be clear, the problem is that there is no vm_page_lookup_unlocked() yet. It should be straightforward to add. I would prefer that to using a naked vm_radix_lookup_unlocked() call here.)