Page MenuHomeFreeBSD

Add a sysctl tree for UMA.
ClosedPublic

Authored by jeff on Nov 26 2019, 12:45 AM.
Tags
None
Referenced Files
F80081634: D22554.id64874.diff
Wed, Mar 27, 3:43 PM
Unknown Object (File)
Thu, Mar 7, 11:45 PM
Unknown Object (File)
Dec 20 2023, 4:32 AM
Unknown Object (File)
Nov 9 2023, 9:01 AM
Unknown Object (File)
Nov 6 2023, 2:39 PM
Unknown Object (File)
Nov 6 2023, 11:54 AM
Unknown Object (File)
Nov 6 2023, 10:42 AM
Unknown Object (File)
Oct 8 2023, 7:57 AM
Subscribers

Details

Summary

This has been sorely needed for a while. I have added a sysctl oid for every zone. Here is an example:

vm.uma.RADIX_NODE.stats.xdomain: 0
vm.uma.RADIX_NODE.stats.fails: 0
vm.uma.RADIX_NODE.stats.frees: 3404547
vm.uma.RADIX_NODE.stats.allocs: 3488279
vm.uma.RADIX_NODE.stats.current: 75216
vm.uma.RADIX_NODE.domain.0.wss: 233
vm.uma.RADIX_NODE.domain.0.imin: 22011
vm.uma.RADIX_NODE.domain.0.imax: 22011
vm.uma.RADIX_NODE.domain.0.nitems: 22011
vm.uma.RADIX_NODE.limit.sleeps: 0
vm.uma.RADIX_NODE.limit.sleepers: 0
vm.uma.RADIX_NODE.limit.max_items: 0
vm.uma.RADIX_NODE.limit.items: 0
vm.uma.RADIX_NODE.keg.free: 8
vm.uma.RADIX_NODE.keg.pages: 3925
vm.uma.RADIX_NODE.keg.align: 1
vm.uma.RADIX_NODE.keg.ipers: 27
vm.uma.RADIX_NODE.keg.ppera: 1
vm.uma.RADIX_NODE.keg.rsize: 144
vm.uma.RADIX_NODE.keg.name: RADIX NODE
vm.uma.RADIX_NODE.count_max: 253
vm.uma.RADIX_NODE.count: 203
vm.uma.RADIX_NODE.flags: 2147745792
vm.uma.RADIX_NODE.size: 144

Here's the keg portion of z ZSECOND zone
vm.uma.mbuf_packet.keg.free: 1
vm.uma.mbuf_packet.keg.pages: 1906
vm.uma.mbuf_packet.keg.align: 255
vm.uma.mbuf_packet.keg.ipers: 15
vm.uma.mbuf_packet.keg.ppera: 1
vm.uma.mbuf_packet.keg.rsize: 256
vm.uma.mbuf_packet.keg.name: mbuf

Both mbuf and mbuf_packet will print the same stats for the "mbuf" keg.

I had to preprocess and dup the name because we make zones with duplicate names and use illegal characters for sysctl. This means that it is really a human targeted interface.

I have not yet added any writable controls. Presumably you could update limit policies, bucket sizes, or whatever you like for experimentation. I also think we need some more stats. At one point I had counters for fast path vs slow path allocs etc.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jeff added reviewers: markj, kib, glebius, gallatin.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.
sys/vm/uma_core.c
1852 ↗(On Diff #64874)

I could break this up into multiple functions but it would serve no real purpose other than style. It is inherently very linear and verbose.

1870 ↗(On Diff #64874)

I think this needs to be +2. +1 for underscore and + 1 for null terminator.

sys/vm/uma_int.h
342 ↗(On Diff #64874)

I renamed this because it confused even me.

sys/vm/uma_core.c
263 ↗(On Diff #64874)

It might have been nicer to have both, avoiding ugly unused params.

static void zone_foreach(void (*zfunc)(uma_zone_t));
static void zone_foreach_arg(void (*zfunc)(uma_zone_t, void *), void *);
616 ↗(On Diff #64874)

May want to add __unused ? Etc below.

1875–1883 ↗(On Diff #64874)

This scheme has a couple problems:

  • The name count algorithm doesn't munge the names, so we may not have detected a collision earlier, but collide here (e.g. "foo bar" and "foo_bar" collide).
  • The counting also doesn't strictly work:
z1 = zcreate("foo");   // "foo"
z2 = zcreate("foo");   // "foo_2"
zdestroy(z1);
z3 = zcreate("foo");   // "foo_2" (undetected collision)
z4 = zcreate("foo_2"); // "foo_2" (munging again)

These can lead to trying to reuse the sysctl name, which then would lead to a crash in this code.

We could consider just setting namecnt here in zone_alloc_sysctl according to the first name we are able to successfully insert into the tree. It doesn't look like the sysctl API is very hepful for us in doing that though. We could use sysctl_add_oid but we'd want to suppress the sysctl_warn_reuse spew (maybe with a CTLFLAG?).

sys/vm/uma_core.c
1875–1883 ↗(On Diff #64874)

sysctl won't crash if there is a duplicate. It will print a warning about a re-used entry. I don't think we need to detect super pathological cases. You are right that I should be using count - 1.

sys/vm/uma_core.c
1875–1883 ↗(On Diff #64874)

Yes, sysctl won't crash on collision, but it'll return NULL, which we don't check for here.

I agree we don't need to chase down pathological cases, just make them harmless.

sys/vm/uma_core.c
1875–1883 ↗(On Diff #64874)

For re-used nodes sysctl returns the original oid. It just doesn't return it for leafs. We never use the return value from a leaf oid add.

I originally tested this without any collision detection so I'm fairly sure it doesn't panic.

Does this essentially obsoletes vm.zone_stats ? Wouldn't it be better to extend that interface and vmstat -z ?

Sysctl will allow us to make changes at runtime. For example limits. It is quicker to modify and doesn't create ABI issues. I view this as more of a debugging tool and vmstat as more for system administrators.

sys/vm/uma_core.c
1858 ↗(On Diff #64874)

const

1875–1883 ↗(On Diff #64874)

Yep, I'm on board now.

1902–1903 ↗(On Diff #64874)

Hmm, it's a union, should this be

if (zone->uz_lockptr != &zone->uz_lock && zone->uz_keg != NULL) {
        keg = zone->uz_keg;

?

1955–1956 ↗(On Diff #64874)

s/uz_domain[0]/uz_domain[i]/

1967 ↗(On Diff #64874)

s/minum/minimum/

2029–2045 ↗(On Diff #64874)

Just a thought. We don't NULL uz_ctlname/uz_oid. That's fine, we do bzero them. But should we, for consistency? I expect the compiler to remove the stores of zero after bzero anyway.

2220–2230 ↗(On Diff #64874)

We ought to sysctl_remove_oid earlier, to avoid a race on teardown. The sysctl handlers inspect the keg and the counters.

3801–3806 ↗(On Diff #64874)

Here and below, do we actually need the ZONE_LOCK() for the counter fetch? And, since we are not actually inspecting the uz_cpu cache buckets, do we really need it for that either?

3851–3857 ↗(On Diff #64874)

s/uz_allocs/uz_frees/

sys/vm/uma_int.h
342 ↗(On Diff #64874)

Unfortunately there's still a reference in userspace to clean up... tools/tools/umastat/umastat.c.

Address review feedback. Remove related dead code.

Updates look good to me.

sys/vm/uma_core.c
1870 ↗(On Diff #64874)

The string literal also has a null terminator, so just + 1 for the underscore is fine. (I guess you determined this too since no change was made here.)

This revision is now accepted and ready to land.Nov 27 2019, 9:33 PM