Page MenuHomeFreeBSD

vm_page_insert: use pctrie combined insert/lookup
ClosedPublic

Authored by rlibby on Tue, Jun 4, 7:38 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Jun 16, 6:06 PM
Unknown Object (File)
Fri, Jun 7, 11:07 PM
Unknown Object (File)
Fri, Jun 7, 6:32 PM
Unknown Object (File)
Tue, Jun 4, 9:20 PM
Subscribers

Details

Summary

This reduces work done under vm_page_insert for large objects.

Sponsored by: Dell EMC Isilon

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rlibby requested review of this revision.Tue, Jun 4, 7:38 PM
sys/vm/vm_page.c
1423

If you can infer the value of lookup by examining mpred, why have parameter lookup?

1425

Time to return true/false, rather than 1/0?

sys/vm/vm_page.c
1423

Alas, you can't. Maybe I can rewrite the comment more clearly.

Without lookup, mpred=NULL might mean the lookup=true case, or it might mean that the caller of vm_page_insert_after already did a lookup and found no page, in which case you don't want to duplicate the lookup.

Besides that, having lookup allows the optimizer to inline the procedure and then call only one or the other insert procedure, without a run time test. Probably not a huge deal, but it is a benefit vs say using some other otherwise illegal value for mpred.

I do recognize that it is somewhat ugly.

1425

I don't disagree but I'm not eager to do that here, at least not for vm_page_insert, as that procedure is exported and used by drivers, etc. However, looking at callers, several appear to be missing return value checks and may have bugs. An audit may be needed.

I could do the int->bool conversion just for the helper and vm_page_insert_after, but then I'd need a bool->int conversion for vm_page_insert if I left that API alone. Maybe it's better to leave it to a later conversion cleanup.

This revision is now accepted and ready to land.Wed, Jun 5, 3:24 AM
markj added inline comments.
sys/vm/vm_page.c
1420

markj feedback: comment typo

This revision now requires review to proceed.Wed, Jun 5, 2:39 PM

Mark vm_page_insert_lookup as __always_inline, and reword the comment.

Checking the generated code, the plain inline keyword was not enough to
convince clang, which had only eliminated the vm_page_insert_after
procedure (now a thin wrapper with this patch). Now with __always_inline
it does not generate a separate vm_page_insert_lookup but generates a
vm_page_insert and vm_page_insert_after with the expected code elimination.

gcc13 inlines everything either way, generating vm_page_insert and
inlining vm_page_insert_after/vm_page_insert_lookup into the various
callers in vm_page.c.

Mention this in the comment and also tweak the comment to address
dougm's question.

markj added inline comments.
sys/vm/vm_page.c
1454

Perhaps annotate this branch with __predict_false(). That'll only happen upon a memory allocation error I believe.

This revision is now accepted and ready to land.Wed, Jun 5, 4:41 PM
sys/vm/vm_page.c
1454

Yes, that's right.

In sys/sys/pctrie.h we do now __predict_false() the ENOMEM cases. The compiler may be smart enough for that to be enough. Or maybe not. Checking clang's output, it was enough this time (jumps to near end of function). Should I add the prediction anyway?

sys/vm/vm_page.c
1454

Ah right, I forgot that these functions are inline now.

I don't have strong feeling either way. I'd probably add the annotation anyway but it's fine to omit it, as you note.

markj feedback: predict success

This revision now requires review to proceed.Wed, Jun 5, 5:20 PM
This revision is now accepted and ready to land.Wed, Jun 5, 5:21 PM

In principle, this is fine, but in practice. we mostly call vm_page_insert_after(). So, the impact will be limited. That said, I'm still excited by where I think this is headed.

Have you thought about how the repeated walk will be eliminated from vm_page_alloc() (and vm_page_insert_after()), where we need mpred before the radix tree insertion to lookup a reservation? I can imagine a function akin to vm_radix_insert_lookup_lt() that doesn't take a vm_page pointer as an argument but instead returns a pointer to where that vm_page pointer should be stored in the radix tree once we know what the allocated vm_page is.

Better yet. this new function would return some sort of "iterator" that would enable vm_page_grab_pages() to insert multiple pages without repeating the radix tree walk on every insertion.

In D45486#1038012, @alc wrote:

In principle, this is fine, but in practice. we mostly call vm_page_insert_after(). So, the impact will be limited. That said, I'm still excited by where I think this is headed.

Yes, at this point this is opportunistic given the pctrie combined lookup/insert, which I was originally interested in for the getblk/bgetvp patch (D45395).

Have you thought about how the repeated walk will be eliminated from vm_page_alloc() (and vm_page_insert_after()), where we need mpred before the radix tree insertion to lookup a reservation? I can imagine a function akin to vm_radix_insert_lookup_lt() that doesn't take a vm_page pointer as an argument but instead returns a pointer to where that vm_page pointer should be stored in the radix tree once we know what the allocated vm_page is.

Better yet. this new function would return some sort of "iterator" that would enable vm_page_grab_pages() to insert multiple pages without repeating the radix tree walk on every insertion.

Thanks for the ideas. Honestly, I don't have a concrete plan, but I am thinking about it. You probably saw related comments from markj and jeff in D45396. An iterator that may also be able to handle batch inserts efficiently is an interesting idea.