Page MenuHomeFreeBSD

Add BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code
ClosedPublic

Authored by cem on Oct 28 2016, 5:17 AM.
Tags
None
Referenced Files
F103287803: D8366.diff
Sat, Nov 23, 2:35 AM
Unknown Object (File)
Tue, Oct 29, 10:31 PM
Unknown Object (File)
Oct 20 2024, 1:19 AM
Unknown Object (File)
Oct 19 2024, 11:02 PM
Unknown Object (File)
Oct 1 2024, 10:09 PM
Unknown Object (File)
Sep 27 2024, 1:06 PM
Unknown Object (File)
Sep 26 2024, 9:08 PM
Unknown Object (File)
Sep 22 2024, 10:36 PM
Subscribers

Details

Summary

Upstream the BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code.
This can be handy in tracking down what code touched hung bios and bufs
last. The full history is especially useful, but adds enough bloat that
it shouldn't be enabled in release builds.

Function names (or arbitrary string constants) are tracked in a
fixed-size ring in bufs. Bios gain a pointer to the upper buf for
tracking. SCSI CCBs gain a pointer to the upper bio for tracking.

Test Plan

We use this at Isilon to debug buf/bio timeout issues, etc.

Not sure who to tag for review on this. Feel free to add appropriate
reviewers.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

cem retitled this revision from to Add BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code.
cem updated this object.
cem edited the test plan for this revision. (Show Details)
cem added reviewers: jhb, emaste, markj.

Isn't it unsafe to assume that the layouts of struct bio and struct buf aren't part of the KBI? If so, I'd advocate for keeping BUF_TRACKING completely disabled in GENERIC: it's quite a bit less useful than FULL_BUF_TRACKING in my experience anyway.

sys/cam/cam_xpt.c
5199 ↗(On Diff #21745)

Can you get away without the #ifdef if you make biotrack() into a function as I suggested elsewhere?

sys/kern/vfs_bio.c
4705 ↗(On Diff #21745)

Perhaps have

#if defined(FULL_BUF_TRACKING)
...
#elif defined(BUF_TRACKING)
...
#endif

?

That way FULL_BUF_TRACKING doesn't depend on BUF_TRACKING.

sys/sys/bio.h
149 ↗(On Diff #21745)

There should be a newline after the prototype.

158 ↗(On Diff #21745)

Hm, perhaps have biotrack be an empty inline function with the correct parameters? Then it's harder to break the BUF_TRACKING build if one only tests builds without BUF_TRACKING defined.

sys/sys/buf.h
446 ↗(On Diff #21745)

Style is to have only a single space after "#ifdef".

In D8366#174205, @markj wrote:

Isn't it unsafe to assume that the layouts of struct bio and struct buf aren't part of the KBI? If so, I'd advocate for keeping BUF_TRACKING completely disabled in GENERIC: it's quite a bit less useful than FULL_BUF_TRACKING in my experience anyway.

I'd be okay with moving it from the 'always on' section to the 'disable on stable branches' section (and also adding nooptions BUF_TRACKING to GENERIC-NODEBUG). Is that ok?

sys/cam/cam_xpt.c
5199 ↗(On Diff #21745)

No, I don't think so. biotrack() is a function. The problem is that csio->bio doesn't exist without the BUF_TRACKING option.

sys/kern/vfs_bio.c
4705 ↗(On Diff #21745)

That's fine with me. It would require a little reworking of the patch.

sys/sys/bio.h
149 ↗(On Diff #21745)

Ok.

158 ↗(On Diff #21745)

Sure.

sys/sys/buf.h
446 ↗(On Diff #21745)

Ok.

In D8366#174214, @cem wrote:
In D8366#174205, @markj wrote:

Isn't it unsafe to assume that the layouts of struct bio and struct buf aren't part of the KBI? If so, I'd advocate for keeping BUF_TRACKING completely disabled in GENERIC: it's quite a bit less useful than FULL_BUF_TRACKING in my experience anyway.

I'd be okay with moving it from the 'always on' section to the 'disable on stable branches' section (and also adding nooptions BUF_TRACKING to GENERIC-NODEBUG). Is that ok?

I think it's ok in the sense that it address the issue I pointed out, since out-of-tree modules can't depend on HEAD's KBI anyway. I'm generally just not a fan of having debug features enabled by default unless they have no cost or they address some particularly pressing or widespread need. (Why does the kernel running my wifi router have MALLOC_DEBUG_MAXZONES=8?) But I don't object to your proposal.

sys/cam/cam_xpt.c
5199 ↗(On Diff #21745)

I meant biotrack() in the !defined(BUF_TRACKING) && !defined(FULL_BUF_TRACKING) case, but I see the problem.

In D8366#174220, @markj wrote:
In D8366#174214, @cem wrote:

I'd be okay with moving it from the 'always on' section to the 'disable on stable branches' section (and also adding nooptions BUF_TRACKING to GENERIC-NODEBUG). Is that ok?

I think it's ok in the sense that it address the issue I pointed out, since out-of-tree modules can't depend on HEAD's KBI anyway.

Right.

I'm generally just not a fan of having debug features enabled by default unless they have no cost or they address some particularly pressing or widespread need.

HEAD GENERIC has INVARIANTS and WITNESS on. This is far less costly than those options.

(Why does the kernel running my wifi router have MALLOC_DEBUG_MAXZONES=8?) But I don't object to your proposal.

Is that (or could that be) disabled in GENERIC-NODEBUG? I think that's the way to go for non-development HEAD machines. (Also, unless your wifi router is amd64, this patch doesn't touch it. :-))

In D8366#174224, @cem wrote:
In D8366#174220, @markj wrote:
In D8366#174214, @cem wrote:

I'd be okay with moving it from the 'always on' section to the 'disable on stable branches' section (and also adding nooptions BUF_TRACKING to GENERIC-NODEBUG). Is that ok?

I think it's ok in the sense that it address the issue I pointed out, since out-of-tree modules can't depend on HEAD's KBI anyway.

Right.

I'm generally just not a fan of having debug features enabled by default unless they have no cost or they address some particularly pressing or widespread need.

HEAD GENERIC has INVARIANTS and WITNESS on. This is far less costly than those options.

That's a fair point. Note though that even when WITNESS is disabled, struct lock_object contains the unused lo_witness field (presumably because of the same KBI consideration that I raised). That's the kind of semi-hidden bloat that makes me angsty. :)

WITNESS is useful enough that the extra space is probably justified, but it's something that should be considered on a case-by-case basis. To be clear, I think what you proposed is reasonable.

(Why does the kernel running my wifi router have MALLOC_DEBUG_MAXZONES=8?) But I don't object to your proposal.

Is that (or could that be) disabled in GENERIC-NODEBUG? I think that's the way to go for non-development HEAD machines. (Also, unless your wifi router is amd64, this patch doesn't touch it. :-))

I think that might be the way to go. Really, I'd make the default 1, set it to 8 or whatever in various GENERIC configs, and override it again in GENERIC-NODEBUG. That'll do what I want on arches that have lots of specialized kernel configs. I was just raising this as an offhand example though.

cem marked 8 inline comments as done.
cem edited edge metadata.
  • Fix style issues.
  • Turn off BUF_TRACKING in NODEBUG and suggest disabling for stable branch GENERIC.
  • Allow FULL_BUF_TRACKING to be used without explicitly enabling BUF_TRACKING too.
markj edited edge metadata.

Seems ok to me.

This revision is now accepted and ready to land.Oct 31 2016, 5:50 PM
This revision was automatically updated to reflect the committed changes.