Page MenuHomeFreeBSD

Restructure zalloc for better code generation.
ClosedPublic

Authored by jeff on Feb 13 2020, 9:15 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Nov 18, 7:52 AM
Unknown Object (File)
Tue, Oct 29, 5:36 AM
Unknown Object (File)
Oct 4 2024, 11:44 PM
Unknown Object (File)
Oct 2 2024, 7:37 PM
Unknown Object (File)
Sep 30 2024, 9:43 AM
Unknown Object (File)
Sep 30 2024, 7:15 AM
Unknown Object (File)
Sep 27 2024, 5:16 PM
Unknown Object (File)
Sep 23 2024, 7:41 PM
Subscribers

Details

Summary

Attempt to silence the mjg assembly linter warnings about excessive branches.

clang produces really ugly code with the loop so I just moved it out. I think the code looks nice enough that I don't object. I was unable to measure a real difference but I think it makes it clearer to the reader what the full fast path is and the assembly looked substantially nicer.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jeff added reviewers: mjg, markj, rlibby.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.
sys/vm/uma_core.c
3120 ↗(On Diff #68282)

I guess this should be cache_alloc_fast()

3134 ↗(On Diff #68282)

Maybe cache_alloc_retry or cache_alloc_loop

4334 ↗(On Diff #68282)

This improves zfree which was pulling in the whole thing.

This keeps the avoidable jump forward over memset in the fast path. Other than that I don't have comments.

With an observation that M_ZERO is rarely passed (and in fact should be avoided by callers unless the object is big) and that memset returns the passed buffer, this can be changed to tail call:

diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 8618a620ef7..b94994f1c94 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -3004,8 +3004,8 @@ item_ctor(uma_zone_t zone, int uz_flags, int size, void *udata, int flags,
        if (!skipdbg)
                uma_dbg_alloc(zone, NULL, item);
 #endif
-       if (flags & M_ZERO)
-               bzero(item, size);
+       if (__predict_false(flags & M_ZERO))
+               return (memset(item, 0, size));
 
        return (item);
 }

Note bzero is in fact memset.

markj added inline comments.
sys/vm/uma_core.c
4334 ↗(On Diff #68282)

Why does that matter? In zfree, isn't the zone_free_item() call out of the critical path?

This revision is now accepted and ready to land.Feb 13 2020, 9:30 PM
sys/vm/uma_core.c
4334 ↗(On Diff #68282)

uma_zfree_arg calls it and uma was inlining it.

sys/vm/uma_core.c
4334 ↗(On Diff #68282)

Right, and that call follows the code which looks for an item in the per-CPU buckets and returns it if one is found. So I can't see why the inlining should have any effect on performance.