Add dwatch(8) for watching processes as they trigger dtrace probe
Needs Review (Public)

Authored by dteske on Mar 14 2017, 7:59 PM.

Details

Reviewers
markj
gnn
skreuzer
avg
Group Reviewers
manpages
DTrace
Summary

Introduce dwatch(8) as a tool for making dtrace more useful

Test Plan

Try all the flags. Review the manual. Try each module.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 12771
Build 13039: arc lint + arc unit
asomers added a subscriber: gibbs. Mar 15 2017, 11:56 PM

Adding Justin Gibbs, who has expressed opinions about placing code in cddl. @gibbs, dteske's submission is BSD licensed but depends on the dtrace stuff, all of which is CDDL. Do you think he should put it into usr.sbin or cddl/usr.sbin?

A few observations that may help answer the Q of "usr.sbin vs cddl/usr.sbin"...

+ zfsd lives in cddl/usr.sbin and is BSD licensed
+ not one single program in any of bin, sbin, usr.bin, or usr.sbin references dtrace in any way

Which leads me to believe that there is really only one answer...

cddl/usr.sbin looks to be the right answer.

Will wait 48 hours for feedback before updating patch again to prevent ping-ponging back and forth.

I've only just started playing with it, but I'll point out that the usage seems misleading: dwatch doesn't really take a (dtrace) probe as input, it takes a function or syscall name and constructs a probe name. The first thing I tried (without having fully read the manpage) was "dwatch sched:::on-cpu", which of course gives an error.

I don't see any non-trivial implementations for ACTIONS. It's not really clear to me how it might be used.

usr.sbin/dwatch/dwatch.8
162 (On Diff #26295)

"to nanosleep(2)"

I'll point out that the usage seems misleading: dwatch doesn't really take a (dtrace) probe as input,
it takes a function or syscall name and constructs a probe name.

Not quite true. As documented in the manual, a probe name is constructed only if the argument consists solely of alphanumeric and/or underscore characters. If the argument contains a colon, for example, the probe is left unchanged. Even then, the argument cannot be used to select ":entry" vs ":return" vs ":on-cpu", because dwatch is only for entry points unless you write a module (explained further below).
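The rule described above (construct a probe only when the argument is made up solely of alphanumerics and underscores, and leave anything else alone) can be sketched in a few lines of sh. This is a minimal illustration and not code from the dwatch source; the function name probe_kind and the words it prints are hypothetical.

```sh
#!/bin/sh
# Illustrative only: probe_kind is a hypothetical helper, not dwatch code.
probe_kind()
{
    case "$1" in
    "")
        return 1 ;;                 # nothing to classify
    *[!A-Za-z0-9_]*)
        echo as-is ;;               # has a colon, hyphen, etc.: left unchanged
    *)
        echo construct ;;           # bare name: a probe would be constructed
    esac
}

probe_kind chmod            # prints "construct"
probe_kind sched:::on-cpu   # prints "as-is"
```

Under this reading, "sched:::on-cpu" is passed through untouched, which is consistent with the error reported earlier in the thread: the tool at this revision still expects an entry point.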

The first thing I tried (without having fully read the manpage) was "dwatch sched:::on-cpu", which of course gives an error.

I don't see any non-trivial implementations for ACTIONS. It's not really clear to me how it might be used.

You should always fully read the manual.

It's not for ":on-cpu" it's for ":entry"

At the very top of the manual is the synopsis above the syntax:
dwatch - Watch processes as they enter a particular DTrace probe

The word "enter" is key here.

It takes [[provider:]module:]function, and you cannot select between ":entry", ":return", and something else. dwatch is only for ":entry" on a [[provider:]module:]function trace point (unless you write a module, as explained below).

The manual explains the format of the first (and only) non-option argument below the list of available options.

So while you may be able to see a trace point with the following command using your intended argument (see below dtrace selector):

dteske@nplusfreebsd ~ $ sudo dtrace -ln sched:::on-cpu
   ID   PROVIDER            MODULE                          FUNCTION NAME
60885      sched            kernel                              none on-cpu

The dwatch manual explains that unqualified probes are matched against "dtrace -ln <probe>:entry", and the synopsis states that dwatch is a tool for watching probe *entry* (not return, not on-cpu, not any other name for any given probe, unless you write a module).

If you write a module, you can override the PRINT variable, which defaults to "entry", and change it to something else (e.g., the "execve" option actually uses a PRINT value of "return /execname != this->caller_execname/"). So if you really need to operate on something other than entry, it is possible. You must write a module to do it because that keeps the logic coherent and centrally collected, as can be seen in many of the existing modules, where small things have to be tweaked together to produce the most informative and useful results for a particular probe.

dteske added inline comments. Mar 16 2017, 3:49 PM
usr.sbin/dwatch/dwatch.8
162 (On Diff #26295)

Ah, thanks for catching this. Will be fixed in tomorrow evening's update, slated to likely take us back to cddl/usr.sbin where it can live alongside the BSD-licensed cddl/usr.sbin/zfsd/ code.

@dteske Don't forget to update etc/mtree/BSD.usr.dist

@dteske Don't forget to update etc/mtree/BSD.usr.dist

Thanks for the reminder. 😄

dteske updated this revision to Diff 26391. Mar 18 2017, 4:17 PM

Move back to cddl/usr.sbin to live amongst similarly [BSD] licensed zfsd
Update ObsoleteFiles.inc to remove deprecated/removed watch_* scripts
Fix a copy/paste issue in the manual (s/lchmod/nanosleep/)
Add usr/libexec/dwatch to etc/mtree/BSD.usr.dist

dteske marked 4 inline comments as done. Mar 18 2017, 4:19 PM
dteske updated this revision to Diff 26471. Mar 20 2017, 11:25 PM

Update patch to apply cleanly against latest HEAD

dteske updated this revision to Diff 26472. Mar 20 2017, 11:45 PM

Add `-p action' flag, example usage: dwatch -p on-cpu sched::

dteske updated this revision to Diff 26473. Mar 20 2017, 11:51 PM

Allow modules to contain hyphen in their name

dteske updated this revision to Diff 26474. Mar 20 2017, 11:57 PM

Fix mdoc warning: .It macros in lists of type `tag-list' require arguments

dteske updated this revision to Diff 26475. Mar 20 2017, 11:59 PM

When a dwatch(8) module does not define DETAILS, default to pproc_dump()

dteske updated this revision to Diff 26477. Mar 21 2017, 12:05 AM

Optimize filtration

dteske added a comment. Edited Mar 21 2017, 12:09 AM

The first thing I tried (without having fully read the manpage) was "dwatch sched:::on-cpu", which of course gives an error.

I've added `-p action' to override the default of "entry". So you can do the following: dwatch -p on-cpu sched::

dteske planned changes to this revision. Mar 22 2017, 5:59 PM

I have thought of a way to massively reduce the size of the diff. In the spirit of asomers' observation that several of the modules are quite similar, I've decided to improve the functionality provided to modules, which will allow the modules to be linked to each other whilst preserving explicit probe value.

dteske updated this revision to Diff 26576. Edited Mar 23 2017, 12:29 AM

Incorporate much feedback from previous comments on this review
Update patch to apply cleanly to head
Make lchmod a hard link to chmod
Make vop_{lookup,mkdir,mknod,remove,rmdir} a hard link to vop_create
Change terminology: dwatch(8) modules are now called profiles
NB: Prevents the term "modules" from being confused with dtrace(1) modules
Improve readability and simplify code
Move -e code option to -D code for injecting DTrace event detail code
Move -p action option to -e name for modifying the event action
Move -M option to -p for disabling profiles (formerly termed modules)
Update usage statement to be more clear about what a "probe" is
NB: Instead of "probe" show that we expect "[provider:[module:]]function"
Simplify profile loading
Set $FILE $PROFILE and $PROBE before loading profile(s)
Consolidate file descriptor operations
Fix a bug where -u root or -g root did not work as expected
NB: Also fixed -u 0 and -g 0 which suffered from the same issue
Add further clarification to the manual page with respect to probe format
Add examples to the manual page
Move examples/module_template to examples/profile_template
Improve comments in examples/profile_template
Reduce size of profiles, removing comments inherited from profile_template
Add numerical suffixes in nanosleep profile (s for seconds; ns for nanosec)
Add support for -l -e name [pattern] syntax
NB: The -l [pattern] syntax did not support -e name before

dteske added a comment. Edited Mar 23 2017, 7:24 PM

I wanted to take a moment to explain why the probe syntax for dwatch(8) is "[provider:[module:]]function" and not "[provider:[module:[function:]]]name" as is the probe syntax for dtrace(1).

The dwatch(8) utility aims to simplify access to dtrace(1) and a quick tally of names available from dtrace -l shows that entry is by far the most common name, making a simplified dwatch function the ideal quick-style invocation. See code snippet below:

dteske@nplusfreebsd ~ $ sudo dtrace -l | awk 'N[$NF]++{}END{for(name in N)printf "%8u %s\n", N[name], name}' | sort -bn | awk '$1>9'
      83 start
      94 done
     120 mac-check-err
     120 mac-check-ok
     398 free
     398 malloc
   29487 return
   33643 entry

In second place is return, but walking the td_proc or vnode chain(s) while in a return event may produce unexpected results. All of the remaining names combined amount to less than 5% of the entry names available in dtrace -l.

dteske@nplusfreebsd ~ $ sudo dtrace -l | awk 'NR==1{next} $NF=="return"{next} $NF=="entry",++N[$NF]{next} {N["others"]++} END{for(name in N)printf "%8u %s\n", N[name], name; printf "(others * 100) / entry = %4.2f%% (entry = %4.2f%% of probes minus return)\n", p = (N["others"] * 100) / N["entry"], 100 - p}' | sort -bn
(others * 100) / entry = 4.17% (entry = 95.83% of probes minus return)
    1404 others
   33643 entry
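The percentages above can be re-derived from the two counts without running dtrace. This is a minimal sketch using the counts from the output above; the function name entry_share is hypothetical. The third figure it prints, entry / (entry + others), is an alternative way of expressing the share of entry among non-return probes and is not something the original command computes.

```sh
#!/bin/sh
# Counts are copied from the dtrace -l tally above; the rest is illustrative.
entry_share()
{
    awk -v entry="$1" -v others="$2" 'BEGIN {
        p = others * 100 / entry    # others relative to entry
        printf "others/entry = %.2f%%\n", p
        printf "100 - others/entry = %.2f%%\n", 100 - p
        # Share of entry among all non-return probes.
        printf "entry/(entry+others) = %.2f%%\n", entry * 100 / (entry + others)
    }'
}

entry_share 33643 1404
```

With these counts the sketch prints 4.17%, 95.83%, and 95.99%, so either way of counting supports the same conclusion: entry dominates the names that remain once return is excluded.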

But recognizing the need for access to the others, I have added a -e name option (formerly -p action before last update to diff) for providing access to less common names. I have also given an example of how to select something like sched:::on-cpu in the manual under the new EXAMPLES section, such as dwatch -e on-cpu sched::

But I primarily wanted to make it easiest to access the one name that takes up 95.83% of the available dtrace(1) names, so -e name defaults to a name of entry if not given -- ultimately paired up with the non-option argument that follows in the format of [provider:[module:]]function (which now replaces the word "probe" in the manual synopsis and short usage available with -h).
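How a default name of entry combines with the [provider:[module:]]function argument, as of this revision of the diff, can be sketched with getopts. This is an assumption-laden illustration and not the dwatch implementation; build_probe is a hypothetical function that only shows the string handling.

```sh
#!/bin/sh
# Illustrative only: build_probe is a hypothetical helper, not dwatch code.
build_probe()
(
    name=entry                      # default when -e is not given
    OPTIND=1
    while getopts e: flag; do
        case "$flag" in
        e) name=$OPTARG ;;          # -e name overrides the default
        *) exit 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    [ $# -eq 1 ] || exit 1          # exactly one non-option argument
    printf '%s:%s\n' "$1" "$name"
)

build_probe execve                  # prints "execve:entry"
build_probe -e on-cpu sched::       # prints "sched:::on-cpu"
```

The second call shows why the example in the manual is written as "sched::": appending ":on-cpu" to it yields the fully qualified sched:::on-cpu that dtrace(1) accepts.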

Cheers!

dteske removed a reviewer: gnn. Mar 27 2017, 6:21 PM
dteske added a reviewer: gnn.
dteske added a reviewer: skreuzer.
dteske added a reviewer: avg. Mar 28 2017, 4:09 AM
dteske updated this revision to Diff 26792. Mar 29 2017, 8:12 PM

Update diff to apply cleanly to HEAD
Make -q flag squelch errors from dtrace(1)

Herald added 1 blocking reviewer(s): gnn. Mar 29 2017, 8:12 PM
dteske updated this revision to Diff 27563. Apr 20 2017, 12:13 AM

Add "-t test" option for customizing dtrace(1) predicate
Add "-j jail" option for limiting events to a jail name/jid
Change "-p" option (disable profiles) to "-P"
Add "-p pid" option for watching a particular process id
When given "-D-" perform all error checks before reading stdin
Issue an error and exit if group in "-g name" does not exist
Fix spurious double-error when given bad user in "-u name"
Given "-u user" and "-g group" but not "-q", display user before group
Use walltimestamp instead of timestamp for event date/time
Very minor refactoring for code readability
Make ending semi-colon for "-D code" optional
Whitespace and comments
Minor fixes to grammar and punctuation in manual
Sort examples in manual by flag being demonstrated
Add the following examples to manual:
dwatch -f '(mk|rm)dir' execve
dwatch -g wheel execve
dwatch -j 0 execve
dwatch -j myjail execve
dwatch -l 'read$'
dwatch -p 1234 execve
dwatch -q -t 'arg2<10' -D 'printf("%d",arg2)' write
dwatch -v -p 1234 execve

dteske updated this revision to Diff 27730. Apr 25 2017, 11:33 PM

Update patch to apply cleanly to HEAD
Fix generation of ellipsis for trailing arguments
Remove double-quotes on $COUNT which is guaranteed to be a number

dteske updated this revision to Diff 27768. Apr 26 2017, 10:19 PM

Update to apply cleanly to HEAD
Add "-x" option to enable probe tracing
Don't set caller_execname unless hooked on syscall::execve
Fix indentation in generated dtrace(1) code
Beautify generated dtrace(1) code
Fix nested DTrace predicates; "-t test" now works with profiles
Postpone jail predicate generation until after profile loading
Fix a missing ".Ar" in dwatch(8) manual
Update examples/profile_template to include EVENT_TEST info
Change "kill" profile to print unmodified pid_t argument
Fix signedness in printf for signal argument in "kill" profile
Add code to "vop_*" profiles to NULL-ify transient variables
Optimize vnode-walking process in VFS profiles
Fix ellipsis generation for paths exceeding DEPTH_MAX in VFS profiles
Update VFS profiles to use EVENT_TEST instead of EVENT for predicates
Fix incorrect "probe ID" comments generated by vop_rename profile

dteske updated this revision to Diff 29314. Edited Jun 8 2017, 1:13 AM

Complete overhaul in an effort to address earlier feedback.

The usage and syntax and many flags have been altered.
Below is an itemized list of individual targeted changes:

+ Move manpage from section 8 to section 1 to mirror dtrace(1)
+ Updated usr mtree dist file to include /usr/share/examples/dwatch
+ Updated the man-page:

  • Title updated
  • Synopsis updated
  • Syntax options expanded to match dtrace(1) manual
  • Made to pass "mdoc -Tlint" and "igor -Dgpxy" cleanly
  • Remove extraneous .Pp before .Bl
  • Remove "the following" in various locations
  • Remove "simple" and "simply" in a couple cases
  • Move the EXAMPLES section below the ENVIRONMENT and EXIT STATUS sections
  • Fix a typo s/lchmod/nanosleep/ in PROFILES section
  • Add description before options
  • Moved options to their own section
  • Updated option descriptions
  • Add many new examples (one for each flag) and update existing

+ Syntax changes:

  • The probe format now matches that of dtrace(1)
  • Options -e, -f, -F, -l, -m, -n, -p pid, -P, -q, -v, -V, -w, and -x do the same thing as in dtrace(1)
  • No longer any conflicting options when compared to dtrace(1) syntax
  • The -e name option argument has been deprecated
  • The -h option has been deprecated, matching dtrace(1)
  • Options -d, -g group, -j jail, -l, -p pid, -q, -t test, -u user, and -x are the same as the previous version
  • Options -g group, -p pid, and -u user can now take a regular expression
  • Options -1, -e (no argument), -f (no argument), -F, -k name, -Q, -T time, -V, -w, and -y have been added
  • Option -c count is now -N count
  • Option -D code is now -E code
  • Option -f regex is now -z regex
  • Option -m num is now -B num
  • Option -n num is now -K num
  • Options -m, -n, -P, and -v have been repurposed to match dtrace(1)
  • Loading of modules has been disabled by default (formerly you had to use -P to disable loading of modules) and -M name has been added to enable loading a module of given name

+ Added ANSI coloring to output when stdout is to a terminal
+ Added mathemagical probe selection algorithm for expanding unqualified probes

  • If a probe does not contain a colon (:) and none of the options -P (provider), -m (module), -f (function), or -n (name) are given to indicate the probe type, this algorithm uses mathematics to determine the most likely choice and expands it to a fully-qualified probe (of the format [provider]:[module]:[function]:[name])

+ Add DTRACE_PRAGMA to profile_template (in examples)
+ Update profile_template and profile PROBE settings to allow for inheritance
+ Make use of probefunc in profiles when we need to display function name
+ Add a one-line flag (-1) to prefer single-line output (vs -R for tree)
+ Add function-trace flag (-F) which works same as dtrace(1)
+ Add -P, -m, -f, and -n flags for specifying probe type
+ Add -k name option argument for making the trigger only fire when execname matches the given name, name*, *name, or *name* supported formats
+ Add -Q flag for listing available profiles
+ Add -T time option argument for enabling timeout
+ Add a version flag (-V) for checking the version of dwatch
+ Add -w flag for enabling destructive capabilities (enables dtrace -w)
+ Add -y flag to always enable color even if stdout is not a terminal
+ Filter dtrace(1) errors (stderr output) unless -v option is given
+ Add the ability to list unique providers vs modules vs functions vs names

  • Use -lP, -lm, -lf, or -ln respectively

+ Don't list profiles in the standard syntax usage statement (moved to -Q)
+ Allow more than one probe to be given
+ Allow options to be specified after probe argument(s)
+ Support cascading predicates

dteske added a comment. Jun 8 2017, 1:44 AM

I am running an open public beta of this software. To test the software that is currently submitted here for review, head over to pkg.fraubsd.org in your browser, pick an architecture, and either follow the instructions for setting up the FrauBSD pkg repository and say "pkg install fraubsd/dwatch" or download the dwatch tarball and install with "pkg add PKGFILENAME.txz"

dteske added a comment. Jun 8 2017, 1:51 AM

I've only just started playing with it, but I'll point out that the usage seems misleading: dwatch doesn't really take a (dtrace) probe as input, it takes a function or syscall name and constructs a probe name.

Please give dwatch another try.

I've taken what you said to heart and completely re-architected the usage to be more like dtrace.

In fact, I've re-designed the syntax of dwatch expressly to teach people how to use dtrace and also foster good habits when it comes to translating dwatch syntax to dtrace or vice-versa. In many ways you can now use dwatch as you would dtrace and vice-versa, using the same probe names and same probe-type indicator flags (-P, -m, -f, and -n).

The first thing I tried (without having fully read the manpage) was "dwatch sched:::on-cpu", which of course gives an error.

You should now be able to say "dwatch sched:::on-cpu" (or even "dwatch on-cpu" if you like).

I don't see any non-trivial implementations for ACTIONS.

The vop_* profiles have huge ACTIONS that are quite complex.

It's not really clear to me how it might be used.

I've added tons of new examples to the manual to help spark ideas.

As always, your feedback is genuinely welcome and thank you for helping me make this software the best version of itself before sharing to the masses.

dteske marked 2 inline comments as done. Jun 8 2017, 1:52 AM

Fixed typo as part of last update. Marking inline comments as done.

markj accepted this revision as: markj. Jun 9 2017, 6:37 PM

I've spent some more time playing with the revised dwatch and found it rather more intuitive than before, and the core seems a fair bit simpler as well. I'll try to make use of it instead of dtrace(1) the next time the need arises. That said, I don't have any real comments on the implementation. I'll point out that dtrace recently gained support for if-statements, which could potentially be used to simplify some of the profile actions at the expense of losing portability to versions that don't have it.

cddl/usr.sbin/dwatch/dwatch
267

Why bother with this? dtrace(1) already auto-loads the dtraceall KLD.

cddl/usr.sbin/dwatch/dwatch.1
66

point*

416

*generated

so, like is this getting committed some time?

In D10006#206784, @gnn wrote:

New, BSD license, code does not need to be in cddl. Most of the new D scripts and programs go into either share/dtrace or the DTrace Toolkit port. Since this is a script I'd put it into share/dtrace if you want it in the src tree.

I'd agree with this... My gut feeling is that I'd look for it in /usr/share/dtrace or maybe /usr/share/examples/dtrace, probably the former. where it is on porridge (/usr/share/dtrace/watch_kill)

so, like is this getting committed some time?

After the presentation in April at Netflix [1], I redesigned the syntax and spent months reworking the code. That resulted in the last update to this review. The feedback after that caused another iteration wherein just the last touchups mentioned by markj were incorporated.

There was then a brief holdover for a couple months wherein another FreeBSD developer working on tracing for bhyve virtual machines shared a future change in flags to dtrace(1) which would have potentially introduced some conflicts with the conservative option arguments selected for dwatch(1) -- specifically chosen to align with dtrace(1). The work by dstolfa stalled and he later said I should not wait for his work. When his work is finished later on down the road (adding `-M machine' option argument to dtrace(1) for specifying a bhyve virtual machine to trace), I can simply update dwatch to align at that time.

I am re-presenting the fully matured dwatch code to a larger audience [2] and will cull feedback once again. I don't expect any changes after this next round.

The next update to this review will incorporate any feedback I get from the Women In Linux Summit this Saturday [2].

Materials for the upcoming presentation (in 72 hours from the time of this writing) are published online [3] but are not yet finished. Currently custom artwork is being composed for the presentation slides and will be incorporated hopefully soon.

Last, but not least, the third beta package will be produced after the WILSummit (on Aug 19) and added to the FrauBSD package repository [4]

[1] https://youtu.be/DdPCtNH4k0w
[2] https://womeninlinuxsummit20173376.sched.com/event/BNd6/introducing-dwatch-the-ultimate-dtrace-tool
[3] https://fraubsd.org/doc/wilsummit/2017/
[4] https://pkg.fraubsd.org/

In D10006#206784, @gnn wrote:

New, BSD license, code does not need to be in cddl. Most of the new D scripts and programs go into either share/dtrace or the DTrace Toolkit port. Since this is a script I'd put it into share/dtrace if you want it in the src tree.

I'd agree with this... My gut feeling is that I'd look for it in /usr/share/dtrace or maybe /usr/share/examples/dtrace, probably the former. where it is on porridge (/usr/share/dtrace/watch_kill)

Don't confuse dwatch with watch_* simply because they both have "watch" in the name.

Further, dwatch is not a single file. For example, installing only dwatch to $PATH won't give you the same functionality as any watch_* script. You would need the companion files named "profiles", which dwatch can load from its own module-path (a colon-separated list of directories it searches). So if dwatch and all its ilk lived in /usr/share/dtrace, using it would become a matter of copying dozens of files to more than one location through multiple commands. That to me violates the definition of an "example" or "simple script" -- dwatch is in fact a framework. The same cannot be said for elements of the DTrace toolkit, which doesn't have a centralized abstracted API on top of DTrace; rather, you take the script you want (e.g., iosnoop), copy it to $PATH, and run it (making it executable if not already).
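Searching a colon-separated list of directories for a named profile, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions and not the dwatch loader: find_profile is a hypothetical function, the path is passed as an argument rather than read from any particular variable, and the first matching directory wins.

```sh
#!/bin/sh
# Illustrative only: find_profile is a hypothetical helper, not dwatch code.
find_profile()
(
    name=$1
    path=$2
    IFS=:                           # split the path on colons
    set -f                          # do not glob-expand directory names
    for dir in $path; do
        [ -n "$dir" ] || continue   # skip empty components
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1                          # not found in any directory
)

# Demonstration with throwaway directories.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
: > "$tmp/b/chmod"
find_profile chmod "$tmp/a:$tmp/b"  # prints the path under $tmp/b
rm -rf "$tmp"
```

The function body runs in a subshell, so the IFS and set -f changes do not leak into the caller. It also shows the point being made in the reply: the script alone is not enough, because the profiles have to exist somewhere on that search path.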

wblock added a subscriber: wblock. Aug 25 2017, 4:26 PM
wblock added inline comments.
cddl/usr.sbin/dwatch/dwatch.1
66

"hits" sounds a little imprecise, but I'm not sure of a good replacement. "executes" or "encounters", maybe.

96

This could benefit from a "the":

Output format with the
118

"the", as above.

307
The format is
319

Not sure whether "uid" should be capitalized.

325
Report
.Nm
version on standard output and exit.
346

I would suggest just "This" rather than below, because the list starts immediately after that sentence.

349

Should this be an .Xr chmod 2 ?

Same question for the rest of these.

384

Use

If
.Ev DWATCH_PROFILES_PATH
is set,
386

s/will search/searches/

And it doesn't really say that the list of directories comes from that variable, so:

If
.Ev DWATCH_PROFILES_PATH
is set,
.Nm
searches for profiles in the colon-separated list of directories in that variable
390
profiles are not loaded.
399

Examples usually do not use the synopsis markup, but indent them to make them stand out. Here is one from gpart.8:

We create a 472-block (236 kB) boot partition at offset 40, which is
the size of the partition table (34 blocks or 17 kB) rounded up to the
nearest 4 kB boundary.
.Bd -literal -offset indent
/sbin/gpart add -b 40 -s 472 -t freebsd-boot ada0
/sbin/gpart bootcode -p /boot/gptboot -i 1 ada0
.Ed
.Pp
dteske marked 15 inline comments as done. Oct 6 2017, 2:08 AM
dteske updated this revision to Diff 35304. Thu, Nov 16, 12:00 AM

Updates incorporating feedback and bug fixen.

+ Update ObsoleteFiles.inc patch to apply cleanly to HEAD
+ Syntax changes:

  • Change -M name to -X profile to make room for future -M machine
  • Add -o output for sending output to a file
  • Add -O cmd for executing a command for each event

+ Minor code cleanups
+ Eliminate unnecessary kldstat check for dtrace kernel module
+ Additional stderr filtering in non-verbose mode
+ Fix regex backslash handling for -r regex and -z regex
+ Add ANSI color to output when stdout is to a console
+ ANSI highlights for -g group, -p pid, -r regex, -u user, -z regex
+ Improvements and bug fixes to above-mentioned options
+ Man-page updates

dteske retitled this revision from Add dwatch(8) for watching processes as they enter dtrace probe to Add dwatch(8) for watching processes as they trigger dtrace probe. Thu, Nov 16, 12:04 AM

If you have pkg.fraubsd.org configured, you can say "pkg upgrade dwatch" to try this code.