Page MenuHomeFreeBSD

Add CSV output to gstat via -C flag
ClosedPublic

Authored by nick_princi.pe on Jul 5 2018, 7:57 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 27, 2:21 PM
Unknown Object (File)
Sat, Apr 27, 2:20 PM
Unknown Object (File)
Sat, Apr 27, 2:20 PM
Unknown Object (File)
Sat, Apr 27, 2:20 PM
Unknown Object (File)
Sat, Apr 27, 2:20 PM
Unknown Object (File)
Sat, Apr 27, 1:27 PM
Unknown Object (File)
Sat, Apr 27, 12:56 PM
Unknown Object (File)
Mar 8 2024, 5:02 AM
Subscribers

Details

Summary

Add a -C option, similar to -B, that allows gstat to produce basic CSV output with absolute timestamps (ISO 8601, nearly.) Multiple devices are handled by way of a single-pivot CSV table with duplicated timestamps for each object output. I find this extremely handy for logging gstat output for medium-term timeframes of several hours to one day - in-between interactive debugging and long-term collectd-type archiving.

Test Plan

Test gstat interactively with all options that existed prior to this patch and ensure output is sane. Focus should be on -B and -b due to logic changes. Also, -c should be looked at, as object names are now stored in a string before being printed, instead of being generated immediately in a print statement. Further, test all options combinations with -C to ensure CSV output looks correct. A known, and documented, issue, is that sub-1s sample intervals work correctly, but are not especially useful due to the timestamp granularity being 1s.

I have run through testing and have been using this patch for several days, and merged this patch (as-is) into trueos/trueos.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

I don't know what you did, but somehow you created this review as a bare diff without any context information. Did you upload the raw diff into reviews.freebsd.org or something? If you use the php5-arcanist command line tool instead, the review will include unchanged portions of the file. It's much easier to review that way.

Attempting to fix mangled diff using arcanist

I don't know what you did, but somehow you created this review as a bare diff without any context information. Did you upload the raw diff into reviews.freebsd.org or something? If you use the php5-arcanist command line tool instead, the review will include unchanged portions of the file. It's much easier to review that way.

Sorry about that. I believe I have fixed it with the updated diff from arcanist.

This change looks good. But I have a few overall concerns:

  • ISO-8601 allows fractional seconds, at least according to Wikipedia. That would improve -C's accuracy and usefulness.
  • The existing logic is fairly ugly. It would be great to do a clean up pass. In particular I think all of the PRINTMSGs could be reduced to a few loops, if you define a format string array and an enabled array.
  • What happens if a new device arrives or one goes away while gstat -C is running?
usr.sbin/gstat/gstat.c
268 ↗(On Diff #44942)

Making two separate printf calls is an inefficient way of line-splitting a string literal. Would you be interested in fixing this pattern as a separate commit?

339 ↗(On Diff #44942)

style(9) says multiline comments get the "/*" and "*/" on their own lines.

459 ↗(On Diff #44942)

This and several other lines exceed 80 characters per line.

besides what @asomers has pointed out already, all those style(9) must be fixed, also it is necessary to bump the date on the manpage.
Overall, seems very good!!

This change looks good. But I have a few overall concerns:

  • ISO-8601 allows fractional seconds, at least according to Wikipedia. That would improve -C's accuracy and usefulness.

Sure. I have no problem with that. How would you feel if I omit factional seconds if -I is >=1s, though?

  • The existing logic is fairly ugly. It would be great to do a clean up pass. In particular I think all of the PRINTMSGs could be reduced to a few loops, if you define a format string array and an enabled array.

Agree this shouldn't be too hard to clean up a bit. I can work on this a bit while I'm on the road for the next few weeks. This should address the only inline comment that is not style-related as well.

  • What happens if a new device arrives or one goes away while gstat -C is running?

I didn't try to be more or less clever than existing -b/-B modes. I actually can't remember the behavior of devices actually attaching or detaching, but if you specify the -a flag, devices will come and go from the CSV output, just as with the existing output modes. Actually, -a with -C will cause samples to be skipped entirely if there is no activity on the system.

Apologies for style(9) violations, I have them fixed in my local branch. I will update the diff when I've cleaned up the logic a bit.

This change looks good. But I have a few overall concerns:

  • ISO-8601 allows fractional seconds, at least according to Wikipedia. That would improve -C's accuracy and usefulness.

Sure. I have no problem with that. How would you feel if I omit factional seconds if -I is >=1s, though?

Not good. At 1s or even 5s intervals, the rounding errors could be significant, since sleep doesn't wakeup at a precise time. And since CSV output isn't intended for human viewing, what's the harm in a few extra digits?

  • The existing logic is fairly ugly. It would be great to do a clean up pass. In particular I think all of the PRINTMSGs could be reduced to a few loops, if you define a format string array and an enabled array.

Agree this shouldn't be too hard to clean up a bit. I can work on this a bit while I'm on the road for the next few weeks. This should address the only inline comment that is not style-related as well.

  • What happens if a new device arrives or one goes away while gstat -C is running?

I didn't try to be more or less clever than existing -b/-B modes. I actually can't remember the behavior of devices actually attaching or detaching, but if you specify the -a flag, devices will come and go from the CSV output, just as with the existing output modes. Actually, -a with -C will cause samples to be skipped entirely if there is no activity on the system.

Ok, that should be good.

Apologies for style(9) violations, I have them fixed in my local branch. I will update the diff when I've cleaned up the logic a bit.

bcr added a subscriber: bcr.

There is a trailing whitespace after the "but" in the man page.
You also need to bump the .Dd to the date of the commit.
Other than that, it's good to go.

This revision is now accepted and ready to land.Aug 4 2018, 3:33 PM

I've made a few improvements based upon reviewer's feedback:

  • Add microseconds to timestamp, fix lines longer than 80 chars, and update date in man page
  • Update date in man page and remove unnecessary section about timestamp granularity, now that more timestamp granularity has been added

I agree with feedback to fix up the logic that builds the output, but I don't think I'll be able to get to that quickly.

This revision now requires review to proceed.Aug 16 2018, 10:48 PM

Generally, I like it. I've wanted a feature like this. I haven't looked at every single detail and just commented on what in a quick pass stood out.

usr.sbin/gstat/gstat.c
229 ↗(On Diff #46805)

Your summary says 'microseconds' but this is 'nanoseconds'.

493 ↗(On Diff #46805)

what's 80 and 50 here? A #define would make it clear they are percentages...

This revision is now accepted and ready to land.Aug 16 2018, 11:00 PM

Some minor style(9), but overall very good. Thank you!

usr.sbin/gstat/gstat.c
56 ↗(On Diff #46805)

You don't need that extra '\' but also it is not harmful!

238 ↗(On Diff #46805)

Is there an extra space after 'dt,' ?

275 ↗(On Diff #46805)

You shall put that else together with the bracket '} else'

289 ↗(On Diff #46805)

Same here '} else {'

298 ↗(On Diff #46805)

also here!

448 ↗(On Diff #46805)

Also here.

  • Addressed style(9) violations
  • Moved percent busy color thresholds to #define per imp@
This revision now requires review to proceed.Aug 17 2018, 12:55 AM
nick_princi.pe added inline comments.
usr.sbin/gstat/gstat.c
229 ↗(On Diff #46805)

Yes, my mistake - it is indeed nanoseconds. I believe the output is correct, in spite of my error in the description:

[nap@yurikamome ~/git/freebsd/usr.sbin/gstat]% ./gstat -C -I 500ms -pods
timestamp,name,q-depth,total_ops/s,read/s,read_sz-KiB,read-KiB/s,ms/read,write/s,write_sz-KiB,write-KiB/s,ms/write,delete/s,delete-sz-KiB,delete-KiB/s,ms/delete,other/s,ms/other,%busy
2018-08-16 20:25:35.624031084,ada0,0,0,0,0,0,0.0,0,0,0,0.0,0,0,0,0.0,0,0.0,0.0
2018-08-16 20:25:36.127696758,ada0,0,0,0,0,0,0.0,0,0,0,0.0,0,0,0,0.0,0,0.0,0.0
2018-08-16 20:25:36.633715581,ada0,0,16,0,0,0,0.0,8,4,32,0.1,0,0,0,0.0,8,0.9,0.8
238 ↗(On Diff #46805)

Yes, deleted

493 ↗(On Diff #46805)

those are the color thresholds for %busy, I've moved them to #define

268 ↗(On Diff #44942)

I agree this should be addressed, but I would prefer to defer that refactoring to another commit.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 21 2018, 11:23 AM
This revision was automatically updated to reflect the committed changes.