Page MenuHomeFreeBSD

dtrace: Add the 'oformat' libdtrace option.
Needs ReviewPublic

Authored by domagoj.stolfa_gmail.com on Sep 5 2023, 8:36 PM.
Tags
Referenced Files
Unknown Object (File)
Thu, Nov 23, 9:18 AM
Unknown Object (File)
Nov 7 2023, 6:27 AM
Unknown Object (File)
Nov 4 2023, 2:14 AM
Unknown Object (File)
Oct 26 2023, 10:55 PM
Unknown Object (File)
Oct 19 2023, 1:30 PM
Unknown Object (File)
Oct 19 2023, 1:30 PM
Unknown Object (File)
Oct 19 2023, 1:30 PM
Unknown Object (File)
Oct 18 2023, 12:47 AM

Details

Reviewers
markj
gnn
Group Reviewers
DTrace
Summary

This option can be used to specify a format to use in DTrace output.
The following formats are supported:

  • json
  • xml
  • html
  • none (default DTrace output)

This is implemented using libxo and integrated into libdtrace. Client
code only works with the following API:

  • dtrace_oformat_setup(dtrace_hdl_t *) -- to be called when output is

starting.

  • dtrace_oformat_teardown(dtrace_hdl_t *) -- to be called when output

is finished

  • dtrace_oformat(dtrace_hdl_t *) -- check if oformat is enabled.
  • dtrace_set_outfp(FILE *) -- sets the output file for oformat.
  • Ensure that oformat is correctly checked in the drop handler and

record processing callbacks.

This commit also adds tests for oformat which only really check that the
output generated is valid JSON and XML.

The format is as follows (JSON as example):

{
  "dtrace": {
    "probes": [
      {
        "timestamp": ...,
        "cpu": ...,
        "id": ...,
        "provider": ...,
        "module": ...,
        "function": ...,
        "name": ...,
        "output": [
           ... (script specific)
        ]
      }
    ]
  }
}

or the XML/HTML equivalent.

Moreover, client code callbacks are always guaranteed to be called in
the "probes" array, including in the drop handler, so the caller may
wish to add its own data there. In this commit, dtrace(1) does not make
use of this feature.

oformat is intended as a supplementary mechanism to add machine-readable
output to DTrace and it is not intended to replace libdtrace as a
library that one can link against directly and process the data that
way. It aims to be a middle ground between manually parsing
pretty-printed output that dtrace(1) outputs and linking against
libdtrace in order to implement a new frontend.

Furthermore, streaming output should be possible but hasn't been tested
at this point. oformat guarantees that at any given point when a probe
fires and output is produced via a flush, it will be in the context of
the "probes" array, so adding "]}}" after the last received probe will
always produce valid JSON. Similarly, any incremental XML parser should
work.

There is an example [1] that can be viewed (written in Python 3) on how
one might use oformat. This file really isn't meant to be "good code" or
pretty, it is simply a quick and dirty script that demonstrates how one
might go about interfacing with the oformat output.

[1]: https://reviews.freebsd.org/P610

This is a first working version of the implementation and subject to change, but I believe that the main design goals are here.

Another design point which might be worth pondering is using the "probes" array for drops. Right now, all of the drops are formatted as a probe that doesn't exist ("dtrace:INTERNAL:INTERNAL:DROP") because this makes it easy to provide drop statistics while the probes are firing and still produce valid JSON/XML throughout the whole run.

The inspiration for this is taken from the CADETS project, however the implementation is different as it has different goals.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
domagoj.stolfa_gmail.com edited the summary of this revision. (Show Details)

Perhaps its own man page is warranted in order to avoid polluting dtrace(1)?

I don't see a reason to avoid adding it to dtrace.1?

How do aggregations get handled? Presumably it doesn't make a lot of sense to specify structured output if aggregations are in use.

Did you post about this to -hackers or -dtrace mailing lists?

cddl/contrib/opensolaris/cmd/dtrace/dtrace.1
656

We should xref libxo somewhere here.

cddl/contrib/opensolaris/cmd/dtrace/dtrace.c
2020

I think this makes sense. In retrospect it might have been a good idea for SIGINFO prints to print to stderr instead of stdout. That would also interact more nicely with the case where dtrace output is going to a file, i.e., -o is specified.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
2143

I rather dislike the code duplication between dt_print_sym() and dt_format_sym(). Why can't both cases be handled with a single function?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_handle.c
424

In general libdtrace doesn't report errors this way. I don't think you really need to check for errors at all.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_oformat.h
6

Please don't add "All rights reserved" to new files.

I don't see a reason to avoid adding it to dtrace.1?

Fair enough, I'll put it there then. I wanted to document the entire format for every single action, which can be a bit lengthy.

How do aggregations get handled? Presumably it doesn't make a lot of sense to specify structured output if aggregations are in use.

See example here: https://reviews.freebsd.org/P610. Aggregations make sense to output in a structured way because of a few reasons:
(1) One might wish to printa(...) periodically and clear(...) and consume the output in a structured way from the program just like any other structured output. Dumping unstructured output in the middle of structured output would likely be problematic + there is no "good" way to iterate over aggregations if there is no structured output;
(2) Aggregations can also contain keys like stack(), and in order to create structured output for stack traces as a part of an aggregation, one also has to structure the aggregation;
(3) It remains consistent with the rest of the structured output and the caller can easily identify when aggregation data follows from agg_data being present in the output object.

One somewhat debatable design point is that I require an explicit printa(...) for the aggregation to be output in a structured way. That is to say, without an END probe that explicitly calls printa() there will be no aggregation output once Ctrl-C is hit. However, my rationale for this is twofold:

  • when writing a program that consumes structured output from DTrace, an additional, unexpected dump of an aggregation once the process stops is fairly annoying from my experimentation. I don't want to get output when I don't ask for it (at least in my use-cases with HyperTrace)
  • the way dtrace/libdtrace handle output upon Ctrl-C does not come from a probe, it comes from a hard-coded code path in dtrace(1) right here: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/cmd/dtrace/dtrace.c#L2018-L2022. This means we would have to create a pseudo-probe for this kind of output as well in order to stay consistent.

This clearly breaks the dtrace(1) expectation, but comes with consistency of only getting output when asking for it in the case of structured output.

A sample output of dtrace -x oformat=json -n 'syscall:::entry { @ = count(); } END { printa(@); }'

will produce the following:

{
  "dtrace": {                                                                                                                                                                                               [0/182]
    "probes": [                                                                                          
      {                                                                                                  
        "timestamp": 1655264792361280,                                                                   
        "cpu": 1,    
        "id": 2,                                                                                         
        "provider": "dtrace",     
        "module": "",
        "function": "",
        "name": "END",
        "output": [
          {                                         
            "agg_data": [
              { 
                "keys": [                           
                  {  
                    "value": "sshd"
                  }   
                ], 
                "count": 17
              },         
              {
                "keys": [
                  {
                    "value": "tmux"
                  }
                ],
                "count": 30
              },
              {
                "keys": [
                  {
                    "value": "sudo"
                  }
                ],
                "count": 36
              },
              {
                "keys": [
                  {
                    "value": "dtrace"
                  }
                ],
                "count": 71
              } 
            ], 
            "agg_name": "_"
          }        
        ]                                           
      }            
    ]             
  }                                                 
}

If you have any thoughts on the way aggregations are handled, I would be happy to discuss it further as I plan to make heavy use of this code in the upcoming HyperTrace experiments so getting it right is important :).

Did you post about this to -hackers or -dtrace mailing lists?

No, I probably should and will aim to do so soon.

Thanks for the quick feedback -- I'll aim to address the comments on the code ASAP.

Address some comments by @markj. The man page comment is still true, as I will be updating that when all the documentation comes in.

domagoj.stolfa_gmail.com marked an inline comment as done.

Attempt #2 at addressing @markj's comments... Forgot one gettimeofday

I suggest to think about adding --libxo foo as an alias to -x oformat=foo to make things more inline with what is actually done for, e.g., netstat(1).

I suggest to think about adding --libxo foo as an alias to -x oformat=foo to make things more inline with what is actually done for, e.g., netstat(1).

Thanks for the feedback! I don't take a strong view on whether or not xo_parse_args() should be called at the start, though it might require some finessing to make it work happily with libdtrace because xo_parse_args() sets a number of flags for libxo that I don't with -x oformat (such as colored output in some cases). Another thing worth pondering is that this is also an argument to libdtrace rather than dtrace(1), so one would inevitably need to still call dtrace_setopt(). Now, however libxo configuration would be performed twice: once in dtrace(1), and once in libdtrace. dtrace(1) itself would explicitly need to include libxo and link against it, but this isn't a big issue IMO, it just "feels" cleaner to me for it to only link against libdtrace and for libdtrace to provide all the libxo handling and configuration and for any consumer code (including dtrace(1)) to simply request it.

However I do agree with the general point that there is an inconsistency with the rest of the base system when it comes to libxo integration (though it's inconsistent in a fair bit of other ways too). One way to work around the issue without xo_parse_args() would be perhaps to add a - flag and manually parse out the -libxo ... that follows? I'm not sure what the best approach would be that doesn't get slightly messy.

Please address the comments from markj@ but otherwise this looks good, and more extensive than what we came up with on CADETS.

This revision is now accepted and ready to land.Sep 7 2023, 1:10 PM

Updated the diff with documentation in the dtrace(1) man page, as well as a bug fix when it comes to naming aggregations. Namely, min, max, sum and count were all called count in the final output due to missing checks. These are now addressed.

This revision now requires review to proceed.Sep 8 2023, 12:16 AM

Add missing information to the man page and fix some bits in it (e.g. using .Fn for action names).

I suggest to think about adding --libxo foo as an alias to -x oformat=foo to make things more inline with what is actually done for, e.g., netstat(1).

Thanks for the feedback! I don't take a strong view on whether or not xo_parse_args() should be called at the start, though it might require some finessing to make it work happily with libdtrace because xo_parse_args() sets a number of flags for libxo that I don't with -x oformat (such as colored output in some cases).

One thing that I miss from --libxo is the pretty option, which makes the output a bit more human-readable. It's useful to be able to manually inspect the output of a script when trying to write code which consumes that output, but I don't see any way to add that option. I don't feel strongly about whether --libxo should be handled or not, but it ought to be possible to specify additional libxo options. @phil might have some guidance here (or general review comments :)).

I don't see a reason to avoid adding it to dtrace.1?

Fair enough, I'll put it there then. I wanted to document the entire format for every single action, which can be a bit lengthy.

I think what you could do is make a new section in the man page (.Sh OUTPUT FORMATTING for instance) and document the option there. In the description of -x oformat, just point the reader to that section.

One somewhat debatable design point is that I require an explicit printa(...) for the aggregation to be output in a structured way. That is to say, without an END probe that explicitly calls printa() there will be no aggregation output once Ctrl-C is hit.

Is this mentioned somewhere in the man page? I couldn't find it anywhere, but it definitely should be documented.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c
2139

Why is the flush call needed in this case but not the regular one?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
804

Let's please call this dt_format_quantize_prepare() to make it a bit clearer that this is a helper for dt_format_quantize(). Same for lquantize.

2603

Why not pass fmt as a string literal?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_print.c
301

Just picking an example at random: as I understand it the intent behind libxo is to replace printf()s with xo_emit()s. I believe libxo will DTRT, so I don't see any purpose in keeping the fprintf() calls around in cases like this. Combining them will shrink the diff a fair bit.

cddl/lib/libdtrace/Makefile.depend
16 ↗(On Diff #127095)

Did you regenerate this file, or did you add it manually?

I suggest to think about adding --libxo foo as an alias to -x oformat=foo to make things more inline with what is actually done for, e.g., netstat(1).

Thanks for the feedback! I don't take a strong view on whether or not xo_parse_args() should be called at the start, though it might require some finessing to make it work happily with libdtrace because xo_parse_args() sets a number of flags for libxo that I don't with -x oformat (such as colored output in some cases). Another thing worth pondering is that this is also an argument to libdtrace rather than dtrace(1), so one would inevitably need to still call dtrace_setopt(). Now, however libxo configuration would be performed twice: once in dtrace(1), and once in libdtrace. dtrace(1) itself would explicitly need to include libxo and link against it, but this isn't a big issue IMO, it just "feels" cleaner to me for it to only link against libdtrace and for libdtrace to provide all the libxo handling and configuration and for any consumer code (including dtrace(1)) to simply request it.

However I do agree with the general point that there is an inconsistency with the rest of the base system when it comes to libxo integration (though it's inconsistent in a fair bit of other ways too). One way to work around the issue without xo_parse_args() would be perhaps to add a - flag and manually parse out the -libxo ... that follows? I'm not sure what the best approach would be that doesn't get slightly messy.

Yeah, I think it'll be best to call xo_parse_args and xo_set_options to allow the user access to the full set of libxo options.

Thanks,
Phil

cddl/contrib/opensolaris/cmd/dtrace/dtrace.1
653

Would you consider changing "none" to "text"? This will keep the terminology in sync with libxo.

And it allow you to pass the string directly to libxo via xo_set_options(), which means you can say:

-x oformat=json,pretty

Assuming that you are wanting all output from dtrace to be in this format, calling xo_parse_args and letting it handle the arguments will be more in line with the other commands that support libxo, so that would be my preference.

On the other you'll want to support "-o file", which should leave the normal terminal output in text and only change the output format for the output file.

I'd suggest perhaps you do want both, where "-x oformat-XXX" and --libxo XXX are identical unless the -o flag is passed, in which case you want "-x" to affect the output file and "--libxo" options to affect the terminal. Would that make sense?

656

I have a new "libxo(7)" documentation that will contain user-oriented documentation:

https://github.com/Juniper/libxo/blob/develop/libxo/libxo.7

This will appear on my new import.

cddl/contrib/opensolaris/cmd/dtrace/dtrace.c
2020

Is this true even with "-o"?

2066–2075

Can there be a new command line option to make this happen, so one-liners can work?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c
481–482

Are you supporting "-o"? If so, do you need to use an explicit handle (and xo_open_instance_h, etc)? Or does the "-o" file get dup2'd to stdout, so this isn't needed?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1033–1039

Why is base an "int", rather than unsigned?

1079–1080

What happens to these with libxo?

1352–1356

If you're going full-on libxo, these output-generating lines will need to be converted.

This particular line will need some thought, since you'd need to snprintf the value, but without the "<", then do something like:

xo_emit("{P:/*s}{:direction}{P: }{:value}", 14 - strlen(c), "<", c);

or something like that.

1590–1593

Use full words for tag names. "symbol" instead of "sym". From xo_format(5):

Use full words
  Do not abbreviate especially when the abbreviation is not obvious or not
  widely used.  Use "data-size", not "dsz" or "dsize".  Use "interface" in‐
  stead of "ifname", "if-name", "iface", "if", or "intf".

The goal is to help tag names be as consistent as possible across all FreeBSD.

1661–1664

I so want to suggest that it make sense to pull the module and offset into distinct tags, but they really make no sense individually, right? Using multiple tags keeps the API user from having to carve the field into the three pieces. Perhaps emitting both styles, like:

<symbol>foo`goo+0x55</symbol>
<module>foo</module>
<name>goo</name>
<offset>0x55</offset>

Overkill?

2233–2234

Consider using "snprintf(t, sizeof(t) ..." just so the next developer doesn't have to think about the use of sprintf.

2603

Assuming it just needs the variable field name. I'm adding a function xo_emit_field() that will support this:

 xo_ssize_t
 xo_emit_field(const char *rolmod, const char *content, const char *fmt,
     const char *efmt, ...);

xo_emit_field("V", pd->dtpa_keyname, "%lld", "%lld",  (long long)*((uint64_t *)addr) / normal);

Again, that will be in the next import of libxo.

3503–3506

Use full words and also use dashes instead of underscores. Users can use the "underscores" option if they need to turn dashes into underscores.

Use hyphens, not underscores
  Use of hyphens is traditional in XML, and the XOF_UNDERSCORES flag can be
  used to generate underscores in JSON, if desired.  But the raw field name
  should use hyphens.
3564–3565

xo_emit returns a suitable error here:

RETURN CODE
   xo_emit returns a negative value on error.  If the XOF_COLUMNS flag has
   been turned on for the specific handle using xo_set_flags(3), then the
   number of display columns consumed by the output will be returned.

Since your test is "(n < 0)", you should be able to just say "n = xo_emit(...)".

cddl/contrib/opensolaris/lib/libdtrace/common/dt_options.c
740–742

Will you be supporting the "csv" encoder? This is another reason to let xo_set_options handle your option processing for you: new encoders are automatically supported.

% df --libxo @csv
name,total-blocks,used-blocks,available-blocks,used-percent,mounted-on
zroot/ROOT/default,3825984331,29376725,3796607605,1,/
devfs,1,1,0,100,/dev
zroot/usr/home,3808301289,11693684,3796607605,0,/usr/home
zroot/var/audit,3796607806,201,3796607605,0,/var/audit
...
cddl/contrib/opensolaris/lib/libdtrace/common/dt_print.c
238–241

This isn't a "value"; it should be an error or something. Maybe something like:

xo_emit("{:message}", "unknown encoding");
275–280

The quotes shouldn't be part of the value; they can be outside it:

xo_emit("'{:value/%c}'", c);
301

The goal behind libxo is to have a single code path, with as little overhead as possible for text output while allowing these other formats to be emitted, so yes, it makes sense to drop the fprintf calls and move them to libxo.

542–544

This also isn't a value, but an error message. Also when we use of JSON or XML, we really want to avoid using regex or pattern matching to extract data from a field. Here someone wanting the size would have to work to get it. Instead consider:

xo_emit("{<:message} {:size/%u}>",  "invalid enum size", (uint_t) size);
cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c
1003–1009

Here you're assigning the output file for the default handle. Another approach would be to make a new handle and use that explicitly. This is certainly a fine choice, but want to make you aware of the alternative and the impact of using the same handle. If you go this way, I'd definitely suggest calling xo_parse_args so all the normal behaviors are available using the normal --libxo options. Also means you can use xo_set_options as I mentioned above.

Thanks for the detailed reviews @markj and @phil! The libxo clarifications have been very helpful, much appreciated. I've added a few comments in places that can be discussed independently and will aim to update the diff addressing the comments sometime this week.

cddl/contrib/opensolaris/cmd/dtrace/dtrace.1
653

I will make the change from "none" to "text" and use xo_set_options(), this should clean the code up quite a bit.

As for --libxo, I've been trying to "hide" libxo from dtrace(1) itself because all the changes are essentially in libdtrace so dtrace(1) itself doesn't really care what formatting library is being used, it just cares that libdtrace is somehow formatting the data in a structured way. I'm not really sure if this is the best approach, but I do feel that attempting to decouple dtrace(1) which is a DTrace consumer and libdtrace itself with new features is worthwhile in order to encourage people to build other consumers rather than just invoking dtrace(1) for convenience sake.

WDYT @markj, @phil?

cddl/contrib/opensolaris/cmd/dtrace/dtrace.c
2020

When -o is specified it'll go to the file because dtrace_aggregate_print is called with g_ofp which will have been changed to the file that we redirected to. Perhaps this is a separate change that could be made unless something else depends on this behavior?

2066–2075

Does -O sound reasonable?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c
481–482

I've considered using xo_open_instance_h and the likes, however since all of the output would need to be redirected anyway I've decided to just defer it to libxo using xo_set_file to minimize book-keeping in dtrace_hdl. Do you think there would be any benefit to doing this explicitly from a libxo perspective? I can't really think of a use-case where I would want to partition the output in any way in libdtrace as it stands.

2139

When we don't use libxo, we don't really care when the flush happens because it's not really meant to be read by a machine in a structured way. However, an explicit flush in the case of libxo should allow one to reliably read aggregation data from a pipe if running under something like Python, so that the data can be processed while DTrace is still running. At least that's the idea :). Do you think we should call xo_flush() in both cases?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1033–1039

This is how libdtrace represented it before, so I just kept it the same as the original code. It added some duplication unfortunately, but it's consistent with the existing libdtrace code.

1079–1080

This function never gets called if oformat is specified, so it should never end up in the output.

1352–1356

Similar to the other one, this doesn't run when oformat is specified so shouldn't end up in the output anywhere. If it does, it's certainly a bug I did not anticipate.

1661–1664

I'm not sure. I don't have a strong opinion on this because all of my stack trace usage only ever has to use them combined, so I didn't do this. If someone could point out where this makes sense I'm happy to split it into multiple entries.

2603

@phil's comment is correct, this was done because I needed a variable field name. The xo_emit_field function seems neat, thanks! I believe that this is the only function that would make use of it, so it shouldn't be too hard to change if the timing turns out to be off.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_options.c
740–742

That's a good point, thanks. I'll change this code to make use of xo_set_options and largely defer the handling of it to libxo. I think I still need to keep the plaintext define around for some bits around libdtrace, but a lot of the other stuff can be cleaned up that way.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_print.c
238–241

I assume this would make sense to also conflate with the fprintf below (same with all the other examples of this pattern)?

cddl/lib/libdtrace/Makefile.depend
16 ↗(On Diff #127095)

Thanks for pointing this out. I did it manually and will address it on the upcoming diff.

cddl/contrib/opensolaris/cmd/dtrace/dtrace.1
653

I can see trying to limit the deep knowledge of libxo from the code, but I want to be able to maximize the benefits of moving to libxo. It's a balance.

And the "--libxo" argument is a bit ugly; I did it just to avoid breaking anything while giving consistent behavior. So having the same behavior under "-x oformat=XXX" makes some sense, but I'd support --libxo for consistency.

cddl/contrib/opensolaris/cmd/dtrace/dtrace.c
2066–2075

Perfect!

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

But since the goal is "one code path", turning these into xo_emit calls makes sense. Also this will allow these headers to show up in "html" style:

    if (xo_emit("\n{T:/%16s} {T:/%41s} {T:%-9s}\n", "value",
	    "------------- Distribution -------------", "count") < 0)

This is assuming that "fp" points to the same thing that the default handle point to.

Or maybe I'm missing something: are you converting all dtrace output or just a subset?

1661–1664

I'm not strongly opinionated on this one either, but it' hard to know what your users will need. I can imagine wanting to filter on a particular module or function, but that's just my random imagination.....

cddl/contrib/opensolaris/lib/libdtrace/common/dt_options.c
740–742

And after calling xo_set_options, you can do something like:

is_plain_text = (xo_get_style(NULL) == XO_STYLE_TEXT);

to set your own flag.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_print.c
238–241

Yup, like the other chuck I sent with the "<" and ">":

xo_emit("<{:message}>", "unknown encoding");

Would "{:warning}" be more clear that something's corrupted?

cddl/contrib/opensolaris/cmd/dtrace/dtrace.1
653

That sounds good to me. I don't take a strong view one way or the other, so I'll add it for consistency in the next diff.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

I think the original reason I did this was because dt_printf is fairly involved in what it does: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c#L561-L578.

The dtrace_sprintf() scenario is not an issue because the only public function that can possibly allocate the the sprintf buffer for DTrace is the dtrace_sprintf() function. As a result, this will never happen from this code path. However, passing the fp as NULL is certainly possible from the consumer end. fp originates from a call to dtrace_consume(). For example, lockstat calls it directly: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/cmd/lockstat/lockstat.c#L1090.

This unfortunately complicates things a fair bit, so I just opted for a completely separate code path. Perhaps my libxo-foo is lacking and I don't see an obvious way to address this with a single code-path without essentially doing an

if (dtp->dt_oformat) {
    xo_emit(...);
} else {
    dt_printf(...);
}
1661–1664

I could add both formats -- the "legacy" (perhaps calling it "frame") which is how DTrace currently represents stack frames and the machine-readable one in the format that you've suggested. Does that sound like a reasonable approach?

Update the diff. This diff should address the following:

  • Leave Makefile.depend alone.
  • Change "none" to "text" and use xo_set_options() + xo_parse_args() to support --libxo and additional options. oformat options take precedence over the ones specified with --libxo.
  • Add a new command-line option, -O, which tells dtrace(1) to print aggregations upon exit if structured output is being used.
  • Use full words and - instead of _ in key names.
  • Add more structured output to stack(), ustack() and sym().
  • Do not represent warnings as "value" and instead use "warning" as the key name and make it more structured.
  • Conflate xo_emit() and plain fprintf() calls.
  • Update the man page.
cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

Yeah, that makes sense.

You could put an xo_handle_t into dtrace_hdl_t and use a FILE * in an xo_handle_t, and I could make a set of xo_set_writer()-friendly write/flush/close functions to that write to a buffer, but then we'd have to look at what's using the dt_sprintf_buf value and what it's doing with it. Not sure where that string goes and what encoding it would want.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

The buffer in question seems to be dt_buffered_buf, which is not really mentioned anywhere throughout the entire codebase other than in dt_printf(). It is somewhat tempting to just remove this code path all together and replace it with xo_emit(), but I'm not sure if anyone is using it either out-of-tree or in some other implementation (macOS, Windows?). We certainly don't lose anything by ignoring it in FreeBSD for the current diff and keep the original behavior for regular non-libxo output.

As for the new libxo functions, I don't know if it would be commonly used outside of this? I'm also unsure if it's worth adding to support this use-case alone, considering that libdtrace doesn't expose dt_buffered_buf publicly nor provides any meaningful function to interact with it. The only way this could be used is if someone has their own version of libdtrace with functions that use it -- which is certainly possible and I'd prefer to avoid breaking (at least for this patch).

@markj Do you have any insight as to what might make sense to do here?

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

I was thinking about:

https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c#L599

So digging in the source, I see that this is really only used by dt_sprintf(), which is only called by dtrace_freopen() and dtrace_system(), and I'm sure neither of those want any sort of libxo formatting.

I think as long as you leave the existing dt_printf suitable for dtrace_sprintf calls, you can freely move the rest of the dt_printf calls to either:

  • xo_emit(...) using the default handle which if either stdout or xo_set_file(NULL, some_fp)
  • xo_emit_h(dtp->dt_xop, ...) using a handle that's either NULL or one from xo_create_to_file()
  • create a new dt_emit(dtp, ...) function that hides xo_emit* calls and the handle and all the gritty current-and-future details

I'd probably lean toward the last. So instead of if/else or two distinct "print average" functions, you just have a single call that does The Right Thing, like:

static int
dt_print_average(dtrace_hdl_t *dtp, FILE *fp, caddr_t addr,
    size_t size, uint64_t normal)
{
    /* LINTED - alignment */
    int64_t *data = (int64_t *)addr;

   return xo_emit(" {:average/%16lld}",
             data[0] ? (long long)(data[1] / (int64_t)normal / data[0]) : 0);
 }

Note that since libxo automatically removes widths from formats,you don't have to give an explicit "encoding" format string (e.g. "{:average/%16ldd/%ldd}". The default encoding format string is the normal one widths removed.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

The problem wasn't with dtrace_sprintf(), rather the scenario where NULL was passed as the FILE *. Specifically, this code: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/lib/libdtrace/common/dt_subr.c#L619-L693.

However, your suggestion would work regardless because dt_emit() could just add this particular check. I'll add dt_emit() in the next diff, keeping around dt_printf() for the other things that use it and then switch to dt_emit() in these cases. Thanks!

cddl/contrib/opensolaris/lib/libdtrace/common/dt_consume.c
1079–1080

Well, assuming this is just straight-forward buffering, libxo contains its own buffering facility.

Hmm... looks like the dt buffering grows unlimited, where xo just tries to make full BUFSIZ contents to reduce i/o overhead. (That's really only mostly true, since we can buffer more, depending on the particular features being used.)

I don't really know if this is or isn't important; the buffered flush doesn't seem to be that wide spread:

% grep -r dt_buffered_flush .
./common/dt_printf.c:                           if (dt_buffered_flush(dtp, NULL, NULL,
./common/dt_printf.c:                   if (dt_buffered_flush(dtp, NULL,
./common/dt_impl.h:extern int dt_buffered_flush(dtrace_hdl_t *, dtrace_probedata_t *,
./common/dt_subr.c:dt_buffered_flush(dtrace_hdl_t *dtp, dtrace_probedata_t *pdata,
./common/dt_consume.c:          if (dt_buffered_flush(dtp, NULL, rec, aggdata,
./common/dt_consume.c:          if (dt_buffered_flush(dtp, NULL, rec, aggdata,
./common/dt_consume.c:  if (dt_buffered_flush(dtp, NULL, NULL, aggdata,
./common/dt_consume.c:                  if (dt_buffered_flush(dtp, &data, rec, NULL, 0) < 0)

but this just a quick look. Ah, it's used for "dtrace -l".

Like you said, these details can hide inside dt_emit. And you could make xo_set_writer() functions that copy data into dt_buffered_buf, at the cost of another copy.

cddl/contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c
2139

So this fixes an actual problem you encountered? I don't quite understand what you mean by "reliably".

The fact that we flush in one case but not the other just looks odd, I don't have an opinion on whether we should flush or not. If it's necessary to flush in one case but not the other, a comment should explain why.

@phil I've tried to do a basic implementation using xo_create_to_file and then implementing dt_emit() by redirecting buffered output and sprintf output to the regular printf output (it becomes dt_vprintf since i have a va_list instead of variadic arguments). However, I've noticed that if I do regular text output using

	n = xo_emit_hv(dtp->dt_xop, format, ap2);

the expected text output changes:

Regular DTrace with libxo:

 $ sudo dtrace -n 'BEGIN { mod(0xffffffff82317000); exit(0); }'
dtrace: description 'BEGIN ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  7      1                           :BEGIN   opensolaris.ko

Using xo_emit_hv():

 $ sudo dtrace -n 'BEGIN { mod(0xffffffff8261a000); exit(0); }'
dtrace: description 'BEGIN ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  6      1                           :BEGIN

  opensolaris.ko

and I can't seem to figure out a way to make this happy. Furthermore, there are certain parts of the non-structured output that take a format specifier in the function. While many do not make use of it anywhere in libdtrace, others do, and therefore would not be eligible for dt_emit() that easily without extensive testing: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/lib/libdtrace/common/dt_printf.c#L371 where format gets populated here: https://github.com/freebsd/freebsd-src/blob/main/cddl/contrib/opensolaris/lib/libdtrace/common/dt_printf.c#L1357.

I feel like going further down this route would be fairly error-prone and I'm not sure if we currently have coverage in the test suite to detect regressions for everything. Do you have any insight as to how one could go about using libxo sensibly here? I'm somewhat tempted to keep it separate due to the seeming lack of ability to behave in the same way that regular fprintf output does in libdtrace for now (it's not really clear to me how to make it happen) and perhaps add features + conflate the outputs over time as libxo versions get pulled in. Do you think such an approach would be sensible?

Thanks!

$ sudo dtrace -n 'BEGIN { mod(0xffffffff8261a000); exit(0); }'
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME

6      1                           :BEGIN

opensolaris.ko

That's pretty terrible, given that all flavors of xo_emit* turn into the same codepath (xo_do_emit). Can you isolate this into something I can debug, or send me your current patch (phil@freebsd.org) and I'll see if I can do it?

I feel like going further down this route would be fairly error-prone and I'm not sure if we currently have coverage in the test suite to detect regressions for everything. Do you have any insight as to how one could go about using libxo sensibly here? I'm somewhat tempted to keep it separate due to the seeming lack of ability to behave in the same way that regular fprintf output does in libdtrace for now (it's not really clear to me how to make it happen) and perhaps add features + conflate the outputs over time as libxo versions get pulled in. Do you think such an approach would be sensible?

I can appreciate your position, and while "one code path" is best, it might not be possible here. That said, I'd like a chance to debug this and see why they aren't consistent, even if you decide to take another path.

Thanks,
Phil

That's pretty terrible, given that all flavors of xo_emit* turn into the same codepath (xo_do_emit). Can you isolate this into something I can debug, or send me your current patch (phil@freebsd.org) and I'll see if I can do it?

Sure thing, I'll shave off all the extra bits and email it to you (e.g. only mod will have the do_emit() changes). Thanks!