Page MenuHomeFreeBSD

Add a sysctl tree for UMA.
ClosedPublic

Authored by jeff on Nov 26 2019, 12:45 AM.
Tags
None
Referenced Files
F106195116: D22554.diff
Thu, Dec 26, 11:22 PM
Unknown Object (File)
Nov 22 2024, 6:58 PM
Unknown Object (File)
Nov 16 2024, 7:25 AM
Unknown Object (File)
Nov 15 2024, 6:37 PM
Unknown Object (File)
Nov 5 2024, 2:18 PM
Unknown Object (File)
Oct 9 2024, 10:40 PM
Unknown Object (File)
Oct 6 2024, 8:20 PM
Unknown Object (File)
Oct 6 2024, 3:57 AM
Subscribers

Details

Summary

This has been sorely needed for a while. I have added a sysctl oid for every zone. Here is an example:

vm.uma.RADIX_NODE.stats.xdomain: 0
vm.uma.RADIX_NODE.stats.fails: 0
vm.uma.RADIX_NODE.stats.frees: 3404547
vm.uma.RADIX_NODE.stats.allocs: 3488279
vm.uma.RADIX_NODE.stats.current: 75216
vm.uma.RADIX_NODE.domain.0.wss: 233
vm.uma.RADIX_NODE.domain.0.imin: 22011
vm.uma.RADIX_NODE.domain.0.imax: 22011
vm.uma.RADIX_NODE.domain.0.nitems: 22011
vm.uma.RADIX_NODE.limit.sleeps: 0
vm.uma.RADIX_NODE.limit.sleepers: 0
vm.uma.RADIX_NODE.limit.max_items: 0
vm.uma.RADIX_NODE.limit.items: 0
vm.uma.RADIX_NODE.keg.free: 8
vm.uma.RADIX_NODE.keg.pages: 3925
vm.uma.RADIX_NODE.keg.align: 1
vm.uma.RADIX_NODE.keg.ipers: 27
vm.uma.RADIX_NODE.keg.ppera: 1
vm.uma.RADIX_NODE.keg.rsize: 144
vm.uma.RADIX_NODE.keg.name: RADIX NODE
vm.uma.RADIX_NODE.count_max: 253
vm.uma.RADIX_NODE.count: 203
vm.uma.RADIX_NODE.flags: 2147745792
vm.uma.RADIX_NODE.size: 144

Here's the keg portion of z ZSECOND zone
vm.uma.mbuf_packet.keg.free: 1
vm.uma.mbuf_packet.keg.pages: 1906
vm.uma.mbuf_packet.keg.align: 255
vm.uma.mbuf_packet.keg.ipers: 15
vm.uma.mbuf_packet.keg.ppera: 1
vm.uma.mbuf_packet.keg.rsize: 256
vm.uma.mbuf_packet.keg.name: mbuf

Both mbuf and mbuf_packet will print the same stats for the "mbuf" keg.

I had to preprocess and dup the name because we make zones with duplicate names and use illegal characters for sysctl. This means that it is really a human targeted interface.

I have not yet added any writable controls. Presumably you could update limit policies, bucket sizes, or whatever you like for experimentation. I also think we need some more stats. At one point I had counters for fast path vs slow path allocs etc.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 27821
Build 26000: arc lint + arc unit

Event Timeline

jeff added reviewers: markj, kib, glebius, gallatin.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.
sys/vm/uma_core.c
1852

I could break this up into multiple functions but it would serve no real purpose other than style. It is inherently very linear and verbose.

1870

I think this needs to be +2. +1 for underscore and + 1 for null terminator.

sys/vm/uma_int.h
342

I renamed this because it confused even me.

sys/vm/uma_core.c
263–264

It might have been nicer to have both, avoiding ugly unused params.

static void zone_foreach(void (*zfunc)(uma_zone_t));
static void zone_foreach_arg(void (*zfunc)(uma_zone_t, void *), void *);
614

May want to add __unused ? Etc below.

1875–1883

This scheme has a couple problems:

  • The name count algorithm doesn't munge the names, so we may not have detected a collision earlier, but collide here (e.g. "foo bar" and "foo_bar" collide).
  • The counting also doesn't strictly work:
z1 = zcreate("foo");   // "foo"
z2 = zcreate("foo");   // "foo_2"
zdestroy(z1);
z3 = zcreate("foo");   // "foo_2" (undetected collision)
z4 = zcreate("foo_2"); // "foo_2" (munging again)

These can lead to trying to reuse the sysctl name, which then would lead to a crash in this code.

We could consider just setting namecnt here in zone_alloc_sysctl according to the first name we are able to successfully insert into the tree. It doesn't look like the sysctl API is very hepful for us in doing that though. We could use sysctl_add_oid but we'd want to suppress the sysctl_warn_reuse spew (maybe with a CTLFLAG?).

sys/vm/uma_core.c
1875–1883

sysctl won't crash if there is a duplicate. It will print a warning about a re-used entry. I don't think we need to detect super pathological cases. You are right that I should be using count - 1.

sys/vm/uma_core.c
1875–1883

Yes, sysctl won't crash on collision, but it'll return NULL, which we don't check for here.

I agree we don't need to chase down pathological cases, just make them harmless.

sys/vm/uma_core.c
1875–1883

For re-used nodes sysctl returns the original oid. It just doesn't return it for leafs. We never use the return value from a leaf oid add.

I originally tested this without any collision detection so I'm fairly sure it doesn't panic.

Does this essentially obsoletes vm.zone_stats ? Wouldn't it be better to extend that interface and vmstat -z ?

Sysctl will allow us to make changes at runtime. For example limits. It is quicker to modify and doesn't create ABI issues. I view this as more of a debugging tool and vmstat as more for system administrators.

sys/vm/uma_core.c
1858

const

1875–1883

Yep, I'm on board now.

1902–1903

Hmm, it's a union, should this be

if (zone->uz_lockptr != &zone->uz_lock && zone->uz_keg != NULL) {
        keg = zone->uz_keg;

?

1955–1956

s/uz_domain[0]/uz_domain[i]/

1967

s/minum/minimum/

2029–2045

Just a thought. We don't NULL uz_ctlname/uz_oid. That's fine, we do bzero them. But should we, for consistency? I expect the compiler to remove the stores of zero after bzero anyway.

2222–2233

We ought to sysctl_remove_oid earlier, to avoid a race on teardown. The sysctl handlers inspect the keg and the counters.

3816–3821

Here and below, do we actually need the ZONE_LOCK() for the counter fetch? And, since we are not actually inspecting the uz_cpu cache buckets, do we really need it for that either?

3866–3872

s/uz_allocs/uz_frees/

sys/vm/uma_int.h
342

Unfortunately there's still a reference in userspace to clean up... tools/tools/umastat/umastat.c.

Address review feedback. Remove related dead code.

Updates look good to me.

sys/vm/uma_core.c
1870

The string literal also has a null terminator, so just + 1 for the underscore is fine. (I guess you determined this too since no change was made here.)

This revision is now accepted and ready to land.Nov 27 2019, 9:33 PM