Page MenuHomeFreeBSD

Speed up geom_stats_resync in the presence of many devices
ClosedPublic

Authored by asomers on Feb 27 2021, 4:05 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Apr 17, 2:47 PM
Unknown Object (File)
Wed, Apr 17, 2:13 PM
Unknown Object (File)
Thu, Apr 11, 11:32 AM
Unknown Object (File)
Fri, Mar 22, 5:38 PM
Unknown Object (File)
Fri, Mar 22, 5:38 PM
Unknown Object (File)
Mar 8 2024, 7:40 AM
Unknown Object (File)
Feb 10 2024, 5:59 AM
Unknown Object (File)
Feb 1 2024, 3:59 AM
Subscribers

Details

Summary

The old code had a O(n) loop, where n is the size of /dev/devstat.
Multiply that by another O(n) loop in devstat_mmap for a total of
O(n^2).

This change adds DIOCGMEDIASIZE support to /dev/devstat so userland can
quickly determine the right amount of memory to map, eliminating the
O(n) loop in userland.

This change decreases the time to run "gstat -bI0.001" with 16,384 md
devices from 29.7s to 4.2s.

MFC after: 2 weeks
Sponsored by: Axcient

[skip ci] fix a typo in a comment in mdconfig.c Sponsored by: Axcient

Test Plan

Tested with gstat -bI0.001 with up to 16,384 md devices

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

  • Speed up geom_stats_resync in the presence of many devices
  • [skip ci] fix a typo in a comment in mdconfig.c

Looks good to my eye

This revision is now accepted and ready to land.Feb 27 2021, 5:08 PM
lib/libgeom/geom_stats.c
85

I don't very like the use of DIOCGMEDIASIZE here, but I have no better ideas.

91

In the new code you are neither reading nor updating npages variable. You could read npages to skip mmap() call if nothing has changed. And you should update it for other library parts to work correctly.

sys/kern/subr_devstat.c
506

Double semicolon.

lib/libgeom/geom_stats.c
85

I had the same thought... but didn't want to suggest a new one

  • * Fix mav's npages observation
This revision now requires review to proceed.Feb 27 2021, 11:57 PM
lib/libgeom/geom_stats.c
141

I think previous order was intentional to make time more up to date of paging on access take some time.

lib/libgeom/geom_stats.c
141

Interesting idea. Do you have any good idea why there used to be two memset calls?

lib/libgeom/geom_stats.c
141

Not really. Guess some form of swap pager woodoo. Something like first pass triggers swap-in, while second brings up to date TLB caches if not yet, to make following copy as coherent as possible.

lib/libgeom/geom_stats.c
141

So you really think both memsets are required?

lib/libgeom/geom_stats.c
141

And you think it is a copy/paste bug?

lib/libgeom/geom_stats.c
141

Yes, I think it's a copy/paste bug.

mav added inline comments.
lib/libgeom/geom_stats.c
78

I haven't used mmap() that much to know. I am curios whether the lack of munmap() before used some trick not avoid extra actions.

141

I this case I'd say don't fix what's not broken. But whatever you decide. Responsibility is yours. ;)

This revision is now accepted and ready to land.Feb 28 2021, 2:48 AM
lib/libgeom/geom_stats.c
78

I don't know. But the resulted in rapid accumulation of MAP ENTRY memory with every call to geom_stats_resync

141

Even though that code is very old, the original author is still active. So I'll pull it out of this review and take it up separately.