Page MenuHomeFreeBSD

Restructure zalloc for better code generation.
ClosedPublic

Authored by jeff on Thu, Feb 13, 9:15 PM.

Details

Summary

Attempt to silence the mjg assembly linter warnings about excessive branches.

clang produces really ugly code with the loop so I just moved it out. I think the code looks nice enough that I don't object. I was unable to measure a real difference but I think it makes it clearer to the reader what the full fast path is and the assembly looked substantially nicer.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jeff created this revision.Thu, Feb 13, 9:15 PM
jeff edited the summary of this revision. (Show Details)Thu, Feb 13, 9:17 PM
jeff added reviewers: mjg, markj, rlibby.
jeff set the repository for this revision to rS FreeBSD src repository.
jeff added inline comments.Thu, Feb 13, 9:19 PM
sys/vm/uma_core.c
3120 ↗(On Diff #68282)

I guess this should be cache_alloc_fast()

3134 ↗(On Diff #68282)

Maybe cache_alloc_retry or cache_alloc_loop

4334 ↗(On Diff #68282)

This improves zfree which was pulling in the whole thing.

mjg added a comment.EditedThu, Feb 13, 9:27 PM

This keeps the avoidable jump forward over memset in the fast path. Other than that I don't have comments.

With an observation that M_ZERO is rarely passed (and in fact should be avoided by callers unless the object is big) and that memset returns the passed buffer, this can be changed to tail call:

diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 8618a620ef7..b94994f1c94 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -3004,8 +3004,8 @@ item_ctor(uma_zone_t zone, int uz_flags, int size, void *udata, int flags,
        if (!skipdbg)
                uma_dbg_alloc(zone, NULL, item);
 #endif
-       if (flags & M_ZERO)
-               bzero(item, size);
+       if (__predict_false(flags & M_ZERO))
+               return (memset(item, 0, size));
 
        return (item);
 }

Note bzero is in fact memset.

markj accepted this revision.Thu, Feb 13, 9:30 PM
markj added inline comments.
sys/vm/uma_core.c
4334 ↗(On Diff #68282)

Why does that matter? In zfree, isn't the zone_free_item() call out of the critical path?

This revision is now accepted and ready to land.Thu, Feb 13, 9:30 PM
jeff added inline comments.Fri, Feb 14, 7:48 AM
sys/vm/uma_core.c
4334 ↗(On Diff #68282)

uma_zfree_arg calls it and uma was inlining it.

markj added inline comments.Fri, Feb 14, 3:25 PM
sys/vm/uma_core.c
4334 ↗(On Diff #68282)

Right, and that call follows the code which looks for an item in the per-CPU buckets and returns it if one is found. So I can't see why the inlining should have any effect on performance.