Introduce dwatch(1) as a tool for making dtrace more useful
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
- Lint Passed 
- Unit
- No Test Coverage 
- Build Status
- Buildable 8903 - Build 9291: CI src build - Jenkins - Build 9290: arc lint + arc unit 
Event Timeline
Incorporate much feedback from previous comments on this review
Update patch to apply cleanly to head
Make lchmod a hard link to chmod
Make vop_{lookup,mkdir,mknod,remove,rmdir} a hard link to vop_create
Change terminology to name dwatch(8) modules to instead profiles
NB: Prevents term modules from being confused with dtrace(1)
Improve readability and simplify code
Move -e code option to -D code for injecting DTrace event detail code
Move -p action option to -e name for modifying the event action
Move -M option to -p for disabling profiles (formerly termed modules)
Update usage statement to be more clear about what a "probe" is
NB: Instead of "probe" show that we expect "[provider:[module:]]function"
Simplify profile loading
Set $FILE $PROFILE and $PROBE before loading profile(s)
Consolidate file descriptor operations
Fix a bug where -u root or -g root did not work as expected
NB: Also fixed -u 0 and -g 0 which suffered from the same issue
Add further clarification to the manual page with respect to probe format
Add examples to the manual page
Move examples/module_template to examples/profile_template
Improve comments in examples/profile_template
Reduce size of profiles, removing comments inherited from profile_template
Add numerical suffixes in nanosleep profile (s for seconds; ns for nanosec)
Add support for -l -e name [pattern] syntax
NB: The -l [pattern] syntax did not support -e name before
I wanted to take a moment to explain why the probe syntax for dwatch(8) is "[provider:[module:]]function" and not "[provider:[module:[function:]]]name" as is the probe syntax for dtrace(1).
The dwatch(8) utility aims to simplify access to dtrace(1) and a quick tally of names available from dtrace -l shows that entry is by far the most common name, making a simplified dwatch function the ideal quick-style invocation. See code snippet below:
dteske@nplusfreebsd ~ $ sudo dtrace -l | awk 'N[$NF]++{}END{for(name in N)printf "%8u %s\n", N[name], name}' | sort -bn | awk '$1>9'
      83 start
      94 done
     120 mac-check-err
     120 mac-check-ok
     398 free
     398 malloc
   29487 return
   33643 entryIn second-place is return but it may produce unexpected results to walk the td_proc or vnode chain(s) while in a return event. All of the remaining names combined amount to less than 5% of the entry names available in dtrace -l.
dteske@nplusfreebsd ~ $ sudo dtrace -l | awk 'NR==1{next} $NF=="return"{next} $NF=="entry",++N[$NF]{next} {N["others"]++} END{for(name in N)printf "%8u %s\n", N[name], name; printf "(others * 100) / entry = %4.2f%% (entry = %4.2f%% of probes minus return)\n", p = (N["others"] * 100) / N["entry"], 100 - p}' | sort -bn
(others * 100) / entry = 4.17% (entry = 95.83% of probes minus return)
    1404 others
   33643 entryBut recognizing the need for access to the others, I have added a -e name option (formerly -p action before last update to diff) for providing access to less common names. I have also given an example of how to select something like sched:::on-cpu in the manual under the new EXAMPLES section, such as dwatch -e on-cpu sched::
But I primarily wanted to make it easiest to access the one name that takes up 95.83% of the available dtrace(1) names, so -e name defaults to a name of entry if not given -- ultimately paired up with the non-option argument that follows in the format of [provider:[module:]]function (which now replaces the word "probe" in the manual synopsis and short usage available with -h).
Cheers!
Add "-t test" option for customizing dtrace(1) predicate
Add "-j jail" option for limiting events to a jail name/jid
Change "-p" option (disable profiles) to "-P"
Add "-p pid" option for watching a particular process id
When given "-D-" perform all error checks before reading stdin
Issue an error and exit if group in "-g name" does not exist
Fix spurious double-error when given bad user in "-u name"
Given "-u user" and "-g group" but not "-q", display user before group
Use walltimestamp instead of timestamp for event date/time
Very minor refactoring for code readability
Make ending semi-colon for "-D code" optional
Whitespace and comments
Minor fixes to grammar and punctuation in manual
Sort examples in manual by flag being demonstrated
Add the following examples to manual:
	dwatch -f '(mk|rm)dir' execve
	dwatch -g wheel execve
	dwatch -j 0 execve
	dwatch -j myjail execve
	dwatch -l 'read$'
	dwatch -p 1234 execve
	dwatch -q -t 'arg2<10' -D 'printf("%d",arg2)' write
	dwatch -v -p 1234 execve
Update patch to apply cleanly to HEAD
Fix generation of ellipsis for trailing arguments
Remove double-quotes on $COUNT which is guaranteed to be a number
Update to apply cleanly to HEAD
Add "-x" option to enable probe tracing
Don't set caller_execname unless hooked on syscall::execve
Fix indentation in generated dtrace(1) code
Beautify generated dtrace(1) code
Fix nested DTrace predicates; "-t test" now works with profiles
Postpone jail predicate generation until after profile loading
Fix a missing ".Ar" in dwatch(8) manual
Update examples/profile_template to include EVENT_TEST info
Change "kill" profile to print unmodified pid_t argument
Fix signedness in printf for signal argument in "kill" profile
Add code to "vop_*" profiles to NULL-ify transient variables
Optimize vnode-walking process in VFS profiles
Fix ellipsis generation for paths exceeding DEPTH_MAX in VFS profiles
Update VFS profiles to use EVENT_TEST instead of EVENT for predicates
Fix incorrect "probe ID" comments generated by vop_rename profile
Complete overhaul in an effort to address earlier feedback.
The usage and syntax and many flags have been altered.
Below is an itemized list of individual targeted changes:
+ Move manpage from section 8 to section 1 to mirror dtrace(1)
+ Updated usr mtree dist file to include /usr/share/examples/dwatch
+ Updated the man-page:
- Title updated
- Synopsis updated
- Syntax options expanded to match dtrace(1) manual
- Made to pass "mdoc -Tlint" and "igor -Dgpxy" cleanly
- Remove extraneous .Pp before .Bl
- Remove "the following" in various locations
- Remove "simple" and "simply" in a couple cases
- Move the EXAMPLES section below the ENVIRONMENT and EXIT STATUS sections
- Fix a typo s/lchmod/nanosleep/ in PROFILES section
- Add description before options
- Moved options to their own section
- Updated option descriptions
- Add many new examples (one for each flag) and update existing
+ Syntax changes:
- The probe format now matches that of dtrace(1)
- Options -e, -f, -F, -l, -m, -n, -p pid, -P, -q, -v, -V, -w, and -x do the same thing compared to dtrace(1)
- No longer any conflicting options when compared to dtrace(1) syntax
- The -e name option argument has been deprecated
- The -h option has been deprecated, matching dtrace(1)
- Options -d, -g group, -j jail, -l, -p pid, -q, -t test, -u user, and -x are the same as the previous version
- Options -g group, -p pid, and -u user can now take a regular expression
- Options -1, -e (no argument), -f (no argument), -F, -k name, -Q, -T time, -V, -w, and -y have been added
- Option -c count is now -N count
- Option -D code is now -E code
- Option -f regex is now -z regex
- Option -m num is now -B num
- Option -n num is now -K num
- Options -m, -n, -P, and -v have been repurposed to match dtrace(1)
- Loading of modules has been disabled by default (formerly you had to use -P to disable loading of modules) and -M name has been added to enable loading a module of given name
+ Added ANSI coloring to output when stdout is to a terminal
+ Added mathemagical probe selection algorithm for expanding unqualified probes
- If a probe does not contain a colon (:) and none of the options -P (provider), -m (module), -f (function), or -n (name) are given to indicate the probe type, this algorithm uses mathematics to determine the most likely choice and expands it to a fully-qualified probe (of the format [provider]:[module]:[function]:[name]
+ Add DTRACE_PRAGMA to profile_template (in examples)
+ Update profile_template and profile PROBE settings to allow for inheritance
+ Make use of probefunc in profiles when we need to display function name
+ Add a one-line flag (-1) to prefer single-line output (vs -R for tree)
+ Add function-trace flag (-F) which works same as dtrace(1)
+ Add -P, -m, -f, and -n flags for specifying probe type
+ Add -k name option argument for making the trigger only fire when execname matches the given name, name*, *name, or *name* supported formats
+ Add -Q flag for listing querying available profiles
+ Add -T time option argument for enabling timeout
+ Add a version flag (-V) for checking the version of dwatch
+ Add -w flag for enabling destructive capabilities (enables dtrace -w)
+ Add -y flag to always enable color even if stdout is not a terminal
+ Filter dtrace(1) errors (stderr output) unless -v option is given
+ Add the ability to list unique providers vs modules vs functions vs names
- Use -lP, -lm, -lf, or -ln respectively
+ Don't list profiles in the standard syntax usage statement (moved to -Q)
+ Allow more than one probe to be given
+ Allow options to be specified after probe argument(s)
+ Support cascading predicates
I am running an open public beta of this software. To test the software that is currently submitted here for review, head over to pkg.fraubsd.org in your browser, pick an architecture, and either follow the instructions for setting up the FrauBSD pkg repository and say "pkg install fraubsd/dwatch" or download the dwatch tarball and install with "pkg add PKGFILENAME.txz"
Please give dwatch another try.
I've taken what you said to-heart and completely re-architected the usage to be more like dtrace.
In fact, I've re-designed the syntax of dwatch expressly to teach people how to use dtrace and also foster good-habits when it comes to translating dwatch syntax to dtrace or vice-versa. In many ways you can now use dwatch as you would dtrace and vice-versa, using the same probe names and same probe-type indicator flags (-P, -m, -f, and -n).
The first thing I tried (without having fully read the manpage) was "dwatch sched:::on-cpu", which of course gives an error.
You should now be able to say "dwatch sched:::on-cpu" (or even "dwatch on-cpu" if you like).
I don't see any non-trivial implementations for ACTIONS.
The vop_* profiles have huge ACTIONS that are quite complex.
It's not really clear to me how it might be used.
I've added tons of new examples to the manual to help spark ideas.
As always, your feedback is genuinely welcome and thank you for helping me make this software the best version of itself before sharing to the masses.
I've spent some more time playing with the revised dwatch and found it rather more intuitive than before, and the core seems a fair bit simpler as well. I'll try to make use of it instead of dtrace(1) the next time the need arises. That said, I don't have any real comments on the implementation. I'll point out that dtrace recently gained support for if-statements, which could potentially be used to simplify some of the profile actions at the expense of losing portability to versions that don't have it.
| cddl/usr.sbin/dwatch/dwatch | ||
|---|---|---|
| 267 | Why bother with this? dtrace(1) already auto-loads the dtraceall KLD. | |
| cddl/usr.sbin/dwatch/dwatch.1 | ||
| 65 ↗ | (On Diff #29314) | point* | 
| 415 ↗ | (On Diff #29314) | *generated | 
I'd agree with this... My gut feeling is that I'd look for it in /usr/share/dtrace or maybe /usr/share/examples/dtrace, probably the former. where it is on porridge (/usr/share/dtrace/watch_kill)
After the presentation in April at Netflix [1], I redesigned the syntax and spent months reworking the code. That resulted in the last update to this review. The feedback after that caused another iteration wherein just the last touchups mentioned by markj were incorporated.
There was then a brief holdover for a couple months wherein another FreeBSD developer working on tracing for bhyve virtual machines shared a future change in flags to dtrace(1) which would have potentially introduced some conflicts with the conservative option arguments selected for dwatch(1) -- specifically chosen to align with dtrace(1). The work by dstolfa stalled and he later said I should not wait for his work. When his work is finished later on down the road (adding `-M machine' option argument to dtrace(1) for specifying a bhyve virtual machine to trace), I can simply update dwatch to align at that time.
I am re-presenting the fully matured dwatch code to a larger audience [2] and will cull feedback once again. I don't expect any changes after this next round.
The next update to this review will incorporate any feedback I get from the Women In Linux Summit this Saturday [2].
Materials for the upcoming presentation (in 72 hours from the time of this writing) are published online [3] but are not yet finished. Currently custom artwork is being composed for the presentation slides and will be incorporated hopefully soon.
Last, but not least, the third beta package will be produced after the WILSummit (on Aug 19) and added to the FrauBSD package repository [4]
[1] https://youtu.be/DdPCtNH4k0w
[2] https://womeninlinuxsummit20173376.sched.com/event/BNd6/introducing-dwatch-the-ultimate-dtrace-tool
[3] https://fraubsd.org/doc/wilsummit/2017/
[4] https://pkg.fraubsd.org/
Don't confuse dwatch with watch_* simply because they both have "watch" in the name.
Further, dwatch is not a single file. For example, installing only dwatch to $PATH won't give you the same functionality of any watch_* script. You would need the companion files named "profiles" which dwatch can load from it's own module-path (colon-separated list of directories it searches in). So now if dwatch and all its ilk live in /usr/share/dtrace it now becomes a matter of copying dozens of files to more than one location through multiple commands before you can make use of it. That to me violates the definition of an "example" or "simple script" -- dwatch is in-fact a framework. This much cannot be said for elements of the DTrace toolkit which doesn't have a centralized abstracted API on top of DTrace, but rather you take the script you want (e.g., iosnoop) and copy it to $PATH and run it (implied making it executable if not already).
| cddl/usr.sbin/dwatch/dwatch.1 | ||
|---|---|---|
| 65 ↗ | (On Diff #29314) | "hits" sounds a little imprecise, but I'm not sure of a good replacement. "executes" or "encounters", maybe. | 
| 95 ↗ | (On Diff #29314) | This could benefit with a "the": Output format with the | 
| 117 ↗ | (On Diff #29314) | "the", as above. | 
| 306 ↗ | (On Diff #29314) | The format is | 
| 318 ↗ | (On Diff #29314) | Not sure whether "uid" should be capitalized. | 
| 324 ↗ | (On Diff #29314) | Report .Nm version on standard output and exit. | 
| 345 ↗ | (On Diff #29314) | I would suggest just "This" rather than below, because the list starts immediately after that sentence. | 
| 348 ↗ | (On Diff #29314) | Should this be an .Xr chmod 2 ? Same question for the rest of these. | 
| 383 ↗ | (On Diff #29314) | Use If .Ev DWATCH_PROFILES_PATH is set, | 
| 385 ↗ | (On Diff #29314) | s/will search/searches/ And it doesn't really say that the list of directories comes from that variable, so: If .Ev DWATCH_PROFILES_PATH is set, .Nm searches for profiles in the colon-separated list of directories in that variable | 
| 389 ↗ | (On Diff #29314) | profiles are not loaded. | 
| 398 ↗ | (On Diff #29314) | Examples usually do not use the synopsis markup, but indent them to make them stand out. Here is one from gpart.8: We create a 472-block (236 kB) boot partition at offset 40, which is the size of the partition table (34 blocks or 17 kB) rounded up to the nearest 4 kB boundary. .Bd -literal -offset indent /sbin/gpart add -b 40 -s 472 -t freebsd-boot ada0 /sbin/gpart bootcode -p /boot/gptboot -i 1 ada0 .Ed .Pp | 
Updates incorporating feedback and bug fixen.
+ Update ObsoleteFiles.inc patch to apply cleanly to HEAD
+ Syntax changes:
- Change -M name to -X profile to make room for future -M machine
- Add -o output for sending output to a file
- Add -O cmd for executing a command for each event
+ Minor code cleanups
+ Eliminate unnecessary kldstat check for dtrace kernel module
+ Additional stderr filtering in non-verbose mode
+ Fix regex backslash handling for -r regex and -z regex
+ Add ANSI color to output when stdout is to a console
+ ANSI highlights for -g group, -p pid, -r regex, -u user, -z regex
+ Improvements and bug fixes to above-mentioned options
+ Man-page updates
If you have pkg.fraubsd.org configured, you can say "pkg upgrade dwatch" to try this code.
Code updates:
- Make -F option show default details
- Fix -F option to work with -E code
Manual updates:
- The -F option shows probe traversal, not just functions
Code changes:
- Allow "-E code" to be passed multiple times
- Make "-F" use same glyphs as dtrace(1) (e.g., "->" instead of "=>")
Manual changes:
- Wordsmith the initial description paragraph for clarity
- Move sudo comment to top (prevents confusion w/ neighboring text)
- Update text describing "-F" output format
- Fix backslashes in example of "-F" output
- Other wordsmithing
- Fix a quoted parenthetical
- Note that "-E code" and "-k name" can be specified multiple times
- Fix trailing whitespace
New profiles added:
- errno for watching syscall returns with non-zero errno
- io for watching dtrace_io(4)
- io-start and io-done links to io
- ip for watching dtrace_ip(4)
- ip-send and ip-receive links to ip
- open for watching open[at](2) syscall
- openat links to open
- proc for watching dtrace_proc(4) execution
- links to proc: proc-create proc-exec proc-exec-failure proc-exec-success proc-exit
- proc-signal for watching dtrace_proc(4) signals
- links to proc-signal: proc-signal-clear proc-signal-discard proc-signal-send
- rw for watching read(2)/write(2) syscalls
- read and write links to rw
- sched for watching dtrace_sched(4)
- links to sched: sched-change-pri sched-dequeue sched-enqueue sched-lend-pri sched-load-change sched-off-cpu sched-on-cpu sched-preempt sched-remain-cpu sched-surrender sched-sleep sched-tick sched-wakeup
- tcp for watching dtrace_tcp(4) connections
- links to tcp: tcp-accept-established tcp-accept-refused tcp-connect-established tcp-connect-refused tcp-connect-request tcp-accept tcp-connect tcp-established tcp-refused
- tcp-io for watching dtrace_tcp(4) I/O
- tcp-send and tcp-receive links to tcp-io
- udp for watching dtrace_udp(4) I/O
- udp-send and udp-receive links to udp
Changes to existing profiles:
- Improve chmod
- Deprecate fchmod
- Make fchmodat a symlink to chmod
- Make chmod intelligent regarding probefunc
- Make chmod default to watching [l]chmod and fchmodat
- Remove double-quotes in probe assignments
- Fix ID comments (affects -d)
- Front-load all information we intend to dump to thread-local vars
- Put front-loaded info into its own action (affects "-t test")
- Simplify nanosleep to display just requested time
- Comments and whitespace
Makefile changes:
- Update cddl/usr.sbin/Makefile to include dwatch subdir when including dtrace
- New profiles/links added to libexec/Makefile
- Sort profile links alphabetically in libexec/Makefile
- Remove duplicate vop_lookup link in libexec/Makefile
Code changes to dwatch(1):
- Eliminate unnecessary use of awk(1) to generate literal ESC
- Rename variable FUNC_COALESCE to PROBE_COALESCE for clarity
- Clarify in usage that "-B num" is for process argument max
- Change syntax to allow dtrace(1) args to follow probe (allows profiles to use $target for special attachments)
- Fix unused variable(s)
- Make list_*() like usage() and exit before return
- Make list_*() take options instead of using parent namespace
- No need to abstract out static sort in list_probes()
- Use echo unless printf is really necessary
- Sort awk(1) variable arguments
- Use escaped regex for match in list_profiles()
- Use same variable name in sh(1) and awk(1) namespace
- Use properly escaped regex var in list_profiles()
- ANSI highlighting for regex in list_*()
- Allow "-t test" to be passed multiple times
- Update main synopsis in manual
- Update copyright years
- Don't show probefunc (user can use -F to do that)
- Change usage()/manual entry "-o output" to "-o file"
- Hide "out of scratch space" errors (show with -v)
- Strip ANSI from $TAG/$DETAILS when executing "-O cmd"
- Refactor argument processing (can be disabled by unsetting PSARGS)
- Make pproc() capable of working on any struct proc *
- Front-load all information we intend to dump to thread-local vars
- Put front-loaded info into its own action (affects "-t test")
- Comments and whitespace
Manual changes:
- Doc style: the reader can see "the output" without being told
- Add new examples
- Wordsmithing
@gnn: Excited as I am for the accept, I have one last final (big) change to finalize the code before commit.
To tie this closer to base and leverage more of the work of others, I have been moving bits of dwatch into cddl/lib/libdtrace/*.d to centralize the generic logic required for enumerating data.
Specifically, I have already made these two changes:
D14340: Add errno definitions to /usr/lib/dtrace/errno.d
D14386: Add inline to errno.d for translating int to string
And the following change is currently in-review:
D14396: io.d updates/enhancements to aid DTrace scripting
And after that is complete, I have just a couple minor additions to signal.d for which I have not yet filed a review.
When D14396 is complete and the next edit (to signal.d) is complete, I will then update this review (D10006) with the final (truly final) dwatch code that I intend to commit.
@gnn if you could help markj with D14396 and the upcoming signal.d change, it would really help speed along D10006.
Thank you again for all that you have done.
Cheers
New profiles added:
+ proc-status for proc:::{create,exec{,-failure,-success},exit}
+ sched-cpu for sched:::{off,on,remain}-cpu
+ sched-exec for sched:::{sleep,wakeup}
+ sched-pri for sched:::{change,lend}-pri
+ sched-queue for sched:::{{de,en}queue,load-change}
+ tcp-init for tcp:::{accept-{established,refused},connect-{established,refused,request}}
+ tcp-state-change support to tcp profile
+ tcp-status for tcp:::{accept-{established,refused},connect-{established,refused,request},state-change}
Changes to existing profiles:
+ Add thread-local variable declarations
+ Add typecast to non-string probe arguments
+ Always use static probe selectors when dereferencing struct arguments
+ Append b_error when BIO_ERROR status detected from dtrace_io(4)
+ Clean up proc with inline probealias
+ Clean up tcp with inline probeflow and srclocal
+ Comments and whitespace
+ Display all bio_flags bitfields from dtrace_io(4)
+ Don't use transient this->proc from dwatch in sched profile
+ Fix erroneous details (not zeroed from one probe to next) in proc, sched
+ Fix grammar (byte vs bytes) for ip, tcp, udp
+ Ignore dtrace_io(4) events lacking device name
+ Include proc:::signal-{clear,discard,send} events for proc
+ Include tcp:::{receive,send,state-change} events for tcp
+ Insert "--" between pid and proc args in sched
+ Integrate proc-signal into proc
+ Integrate tcp-io into tcp
+ Locate flow to start of line for dtrace_io(4) events
+ Name thread-local variables after man-page arguments when available
+ Protect io profile from NULL deref against struct bio * arg0
+ Refactor errno and proc to rely on svn r329334 r329353
+ Refactor io to rely on svn r329914
+ Refactor proc-signal (now part of proc) to rely on svn r329995
+ Remove brackets around any/all byte values to make parsing easier
+ Replace dash after probename sched:::{en,de}queue with space
+ Sort probe selector values alphabetically
+ Use sensible defaults for flow indicators for forward compatibiilty
+ Use thread-local buffer for disparate probe details
Makefile changes:
+ Sort symlinks alphabetically [libexec/Makefile]
Code changes to dwatch(1):
+ Fix bad probe identifier comments from dwatch (affects -d)
+ Fix regex escaping (affects -r) when -q is used
+ Hide "Bus error" output from dtrace stderr (show with -v)
+ Protect dwatch from NULL deref when curthread has no parents
| ObsoleteFiles.inc | ||
|---|---|---|
| 41 | I will bump this date prior to commit to be the day of the commit. | |