I ran the tests with this applied on amd64, loaded and unloaded a few kmods concurrently for a while, and looked through the concurrency around the patching code. Everything seems to work fine on my end.
May 2 2024
Apr 22 2024
Apr 21 2024
@avg It looks like this hit stable as well so will also need to be reverted before the release is cut: https://cgit.freebsd.org/src/commit/?h=stable/14&id=fb9c50f983ff6bdd6f33a22ae7d5b391435dd02a
Apr 7 2024
Looking at the code a little bit further, it seems that
@avg This seems to introduce a kernel panic when -x bufpolicy=ring is used:
Feb 26 2024
Jan 2 2024
Missing comma in man page.
Address @markj's comments.
Dec 8 2023
Update the diff to address a few output alignment issues when not using structured output via xo_emit(). Without the added xo_flush() calls, the output from xo_emit() would be printed after all the other output is printed via fprintf(). Also added a comment explaining why xo_flush() is called in those situations.
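A minimal sketch of the ordering problem described above, as a toy model rather than libxo itself. `BufferedEmitter` is a hypothetical stand-in for a buffering emitter such as xo_emit(), and the direct `sink.write()` call stands in for fprintf(); it assumes the second writer reaches the output without passing through the emitter's buffer.

```python
import io


class BufferedEmitter:
    """Toy stand-in for a buffering emitter; this is NOT the libxo API."""

    def __init__(self, sink):
        self.sink = sink
        self.pending = []

    def emit(self, text):
        # Text is held back until flush() is called.
        self.pending.append(text)

    def flush(self):
        self.sink.write("".join(self.pending))
        self.pending.clear()


def render(flush_between):
    """Emit a header through the buffer, then a row written directly."""
    sink = io.StringIO()
    emitter = BufferedEmitter(sink)
    emitter.emit("header\n")
    if flush_between:
        emitter.flush()      # analogous to the added xo_flush() calls
    sink.write("row\n")      # direct write, analogous to fprintf()
    emitter.flush()          # final flush at exit
    return sink.getvalue()


if __name__ == "__main__":
    print(repr(render(flush_between=False)))
    print(repr(render(flush_between=True)))
```

Without the intermediate flush the buffered header lands after the directly written row; flushing before switching writers keeps the two in program order.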
Nov 1 2023
That's pretty terrible, given that all flavors of xo_emit* turn into the same codepath (xo_do_emit). Can you isolate this into something I can debug, or send me your current patch (phil@freebsd.org) and I'll see if I can do it?
@phil I've tried to do a basic implementation using xo_create_to_file and then implementing dt_emit() by redirecting buffered output and sprintf output to the regular printf output (it becomes dt_vprintf since I have a va_list instead of variadic arguments). However, I've noticed that if I do regular text output using
Oct 20 2023
Oct 19 2023
Update the diff. This diff should address the following:
Oct 18 2023
Sep 8 2023
Add missing information to the man page and fix some bits in it (e.g. using .Fn for action names).
Updated the diff with documentation in the dtrace(1) man page, as well as a bug fix when it comes to naming aggregations. Namely, min, max, sum and count were all called count in the final output due to missing checks. These are now addressed.
Sep 7 2023
In D41745#951858, @stephane.rochoy_stormshield.eu wrote: I suggest to think about adding --libxo foo as an alias to -x oformat=foo to make things more in line with what is actually done for, e.g., netstat(1).
Sep 6 2023
Attempt #2 at addressing @markj's comments... Forgot one gettimeofday
Address some comments by @markj. The man page comment is still true, as I will be updating that when all the documentation comes in.
I don't see a reason to avoid adding it to dtrace.1?
Sep 5 2023
Sep 4 2023
That's true, it's late :). Updated the diff. Thanks!
Aug 25 2023
Jul 25 2023
Add a missing newline to the option file and regenerate src.conf.5.
Remove a leftover .\".
Address the comments by markj and emaste.
Jul 24 2023
Add tools/build/options/WITH_DTRACE_ASAN and update src.conf.
Jan 24 2023
Dec 7 2022
Tested it on my end and it works. Code also looks good to me. Thanks!
Mar 5 2022
I've read through the man page changes twice and couldn't find anything wrong. LGTM but maybe a read through by someone from docs would be a good idea.
Mar 2 2022
Both seem reasonable to me. I'll aim to run some more tests with this and look it over one more time in detail by Friday and flag if I find anything, but overall looks good to me. Thanks for working on this!
Overall the patch looks good to me. I've run all of this under ASAN + UBSAN (with the full patch set applied) and nothing was flagged in this code as problematic when running the DTrace test suite and the FreeBSD build. I feel like it might be a good idea to properly restructure the code around CTFv2 and CTFv3 to account for other versions in the API rather than relying on the "else" case always being v3, but aside from that LGTM. @markj: what do you think about the alignment warning/error propagation?
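One way to picture the restructuring suggested above is an explicit dispatch on the version, with unknown versions rejected instead of falling into an "else" branch. This is only a sketch: the constants, `parse_v2`, `parse_v3` and `parse` are hypothetical names for illustration and do not correspond to the actual libctf code.

```python
# Hypothetical version constants, for illustration only.
CTF_VERSION_2 = 2
CTF_VERSION_3 = 3


def parse_v2(data):
    # Placeholder for v2-specific handling.
    return ("v2", len(data))


def parse_v3(data):
    # Placeholder for v3-specific handling.
    return ("v3", len(data))


# Each supported version is listed explicitly; nothing is implied by "else".
PARSERS = {
    CTF_VERSION_2: parse_v2,
    CTF_VERSION_3: parse_v3,
}


def parse(version, data):
    """Dispatch on the version and fail loudly for anything unknown."""
    try:
        parser = PARSERS[version]
    except KeyError:
        raise ValueError(f"unsupported CTF version: {version}") from None
    return parser(data)
```

With this shape, adding a future version means adding one table entry, and a container with an unexpected version is reported as an error rather than being silently treated as v3.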
Feb 24 2022
The code looks good to me, but it probably needs man page changes like @debdrup mentioned. I also wonder if adding a comment at the top of the header somewhere simply stating what CTFv2 and CTFv3 are and how they differ at a high level makes sense. Perhaps in one or two short sentences, so that someone unfamiliar with CTF who simply wants to use the header knows what they're working with without needing to read through a fairly detailed man page? Do you have any thoughts on this?
In D34358#778329, @markj wrote: In D34358#778319, @domagoj.stolfa_gmail.com wrote: The original CTF header has a bunch of useful comments describing the format. Perhaps these things should be included in this header too? While I appreciate that this is an import, those comments have helped me quite a bit when dealing with CTF and IMO should be included.
There is a ctf.5 man page that describes the format in much more detail than the comments did. I haven't updated it yet for v3 but will do so. Do you think it is sufficient to reference the man page here?
This looks good to me. Thanks!
The original CTF header has a bunch of useful comments describing the format. Perhaps these things should be included in this header too? While I appreciate that this is an import, those comments have helped me quite a bit when dealing with CTF and IMO should be included.
Dec 10 2021
Sep 23 2021
In D31494#723995, @jhb wrote: In D31494#723984, @mhorne wrote: In D31494#720213, @domagoj.stolfa_gmail.com wrote: Thanks for writing this up @mhorne, much appreciated! I've left a few comments from my experience of setting it up with QEMU + KVM today and the confusion I had while reading the existing documentation, as well as the confusion that I would probably have while reading this.
I wonder if it might make sense to explicitly state that the procedure for virtualization gdb stubs may be different depending on which hypervisor/emulator is being used, and then in follow-up commits (by you or anyone else really) document the procedure for each given hypervisor?
Right, the process for remote gdb is a little different when you are using a hypervisor, mainly because the hypervisor can pause execution automatically when it detects that a client has connected to its gdb stub. For the gdb stub implemented by the kernel, we must force a trap into the debugger in order to detect that a client has connected. That is why this chapter lists extra steps that you did not need.
I didn't come up with any text to really highlight this difference, but I think it will be easier to do once we add a separate [sub]section for remote gdb using bhyve/kvm/whatever. You can suggest specific text for the intro paragraph if you like.
Yes, I think we will want to add a separate section for bare-metal debugging with a hypervisor vs using serial to talk to the in-kernel stub which is what this currently documents.
Sep 12 2021
Thanks for writing this up @mhorne, much appreciated! I've left a few comments from my experience of setting it up with QEMU + KVM today and the confusion I had while reading the existing documentation, as well as the confusion that I would probably have while reading this.
Sep 4 2021
@jhb Any chance you could push this? I've been using this for months and it's worked well for me.
Jun 16 2021
In D30778#692250, @markj wrote: In D30778#692214, @domagoj.stolfa_gmail.com wrote: In D30778#691982, @markj wrote: If there are any tags, e.g., sponsored by, please add them to the review description and I'll commit.
You can add a Signed-off-by: domagoj.stolfa@gmail.com if you'd like
I prefer not to since we don't have a policy around it, at least not yet. If you prefer to have it, then I'll keep it.
In D30778#691982, @markj wrote: If there are any tags, e.g., sponsored by, please add them to the review description and I'll commit.
Jun 15 2021
May 27 2021
This seems reasonable to me.
May 16 2021
I've been using this locally and found it *extremely* useful, and frankly was quite surprised when I found out that EV_ENABLE/EV_DISABLE trashed udata originally.
May 13 2021
In D30255#679519, @sef wrote: I still use an 80 character limit, and an 80 character-wide Terminal window.
Usually my opinion means everyone else has moved on :).
Remove a whole bunch of noise compared to the previous diff.
Update the diff with -U999999
Apr 1 2021
@markj's comments.
Added -x libdir and -x syslibdir as suggested by @markj
Mar 27 2021
In D29435#659665, @markj wrote: (Please let me know if you'd like me to commit this.)
Mar 26 2021
Aug 27 2018
In D16921#360817, @markj wrote: In D16921#360786, @domagoj.stolfa_gmail.com wrote: In D16921#360785, @markj wrote: I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)
Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in the dynamic variable implementation -- but nonetheless would degrade performance :-).
I think it's acceptable for the time being since, as I pointed out, there are at most 2 probes for a given tracepoint currently. The FBT hash table is itself cache-unfriendly, and I'd rather fix the performance problems there holistically. Here's the distribution of hash chain lengths on a system with a fairly stripped-down kernel:
5395 0
9755 1
8748 2
5244 3
2369 4
876 5
280 6
73 7
24 8
3 9
1 10
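A quick way to summarize the distribution above, assuming each pair reads as "number of buckets, chain length" (the counts add up to 32768, a power of two, which is consistent with that reading). The names `CHAIN_LENGTH_COUNTS` and `summarize` are illustrative only.

```python
# chain length -> number of hash buckets with that chain length,
# assuming the pairs above are "count length".
CHAIN_LENGTH_COUNTS = {
    0: 5395, 1: 9755, 2: 8748, 3: 5244, 4: 2369, 5: 876,
    6: 280, 7: 73, 8: 24, 9: 3, 10: 1,
}


def summarize(dist):
    """Return (buckets, entries, mean chain length) for a distribution."""
    buckets = sum(dist.values())
    entries = sum(length * count for length, count in dist.items())
    return buckets, entries, entries / buckets


if __name__ == "__main__":
    buckets, entries, mean = summarize(CHAIN_LENGTH_COUNTS)
    print(f"buckets={buckets} entries={entries} mean={mean:.3f}")
```

Under that assumption the table has 32768 buckets holding 59259 entries, so the average chain is a little under two entries long while about one bucket in six is empty.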
In D16921#360785, @markj wrote: I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)
Why "additional"? The problem exists regardless since fbt_invop() returns after the first matching tracepoint. The problem was invisible before ifuncs since before that all tracepoints had distinct addresses. This patch just hides the problem in the common case.
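The difference between returning after the first matching tracepoint and firing every probe registered at an address can be modelled in a few lines. This is a toy model, not the fbt code: `TracepointTable`, `invop_first_match` and `invop_all` are hypothetical names, and a Python list stands in for the per-address linked list.

```python
from collections import defaultdict


class TracepointTable:
    """Toy model: one table entry per address, holding all probes there."""

    def __init__(self):
        self.by_addr = defaultdict(list)

    def add(self, addr, probe_id):
        self.by_addr[addr].append(probe_id)

    def invop_first_match(self, addr):
        # Models returning after the first matching tracepoint:
        # only one probe fires even if several share the address.
        return self.by_addr.get(addr, [])[:1]

    def invop_all(self, addr):
        # Models walking the whole per-address list and firing each probe.
        return list(self.by_addr.get(addr, []))


if __name__ == "__main__":
    table = TracepointTable()
    table.add(0x1000, 1)   # e.g. probe for the ifunc symbol
    table.add(0x1000, 2)   # e.g. probe for the resolved implementation
    print(table.invop_first_match(0x1000))
    print(table.invop_all(0x1000))
```

With two probes sharing an address, the first-match variant reports only one of them, which is the behaviour that was invisible while all tracepoints had distinct addresses.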
I'm a little worried that this may cause additional confusion if someone expects multiple probes to fire at that point. I wonder if a better workaround would be to just add a warning message whenever DTrace is called, or perhaps when someone calls dtrace -l, or in the man page, or something along those lines, until we fix it? Is there a reason we can't fix this in 13 and MFC the changes?
May 8 2018
Preemptively remove a redundant check for INKERNEL(fp).
In D15359#323676, @andrew wrote: Why not just exclude it in fbt_provide_module_function? I'd prefer we keep the unwinding code together, maybe with a comment that it may be called from such a context.
In D15359#323666, @markj wrote: however, in the future a patch that enabled instrumentation of inlined functions could accidentally break DTrace on aarch64.
Was the problem found while working on such a patch? :)
Apr 30 2018
Had a quick skim, but LGTM.
Apr 21 2018
Apr 12 2018
Is this good to land? Thanks!
Apr 6 2018
Address the other comments by @markj and fix a misleading word in the dtrace_probe_exit() comment.
Address comments by @markj.
Apr 4 2018
Apr 3 2018
Is there anything else I can do for this to land?
Mar 28 2018
Should I make any more changes for this to land? As an aside: I've tried to make the interface as simple as possible to use, with the args[0-2] being consistent.
Mar 27 2018
In D14862#312653, @avg wrote: FWIW, rS253996 is what I did to translate INVARIANTS to DEBUG for ZFS.
Seems like the commit even touched some dtrace modules.
Update the diff based on comments from @markj.