Page MenuHomeFreeBSD

tcp: use declarative style to fill struct tcp_log_buffer
Needs ReviewPublic

Authored by glebius on Apr 8 2025, 5:59 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jul 29, 7:25 AM
Unknown Object (File)
Thu, Jul 10, 8:26 AM
Unknown Object (File)
Sun, Jul 6, 5:11 PM
Unknown Object (File)
Sat, Jul 5, 1:09 AM
Unknown Object (File)
Jun 25 2025, 4:28 AM
Unknown Object (File)
Jun 22 2025, 1:36 AM
Unknown Object (File)
Jun 21 2025, 7:17 AM
Unknown Object (File)
Jun 20 2025, 2:48 PM
Subscribers

Details

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63383
Build 60267: arc lint + arc unit

Event Timeline

Here is suggested change I described on the recent call. I like the declarative style of filling in the info for reasons of code readability and maintenance. And I was sure that a compiler would be smart enough to optimize it. However, analyzing the assembly produced with -O2 I see that the tlb->tlb_opts are always zero-filled regardless if the following memcpy() is done or not. Also, the overall size of the function in assembly grows with this change, mostly because now compiler would allocate size of struct tcp_log_buffer on the stack, then populate it with useful data, and then copy this data from stack to the memory pointed by log_entry. The copy is done as a long queue of movs, without neither rep mov nor a call to memcpy(9). This seems to happen due to stack memory not exactly representing the structure.

I have no idea how this is going to be performing under benchmarks. Speculatively should be slower, as more instructions and memory copying is done. However, unlike the unmodified version, this assembly would never touch memory returned from uma_zalloc() if it fails for any case. This seems like a nice feature to avoid cache poisoning, but for a general function that can fail sometimes. This particular function can fail only with a panic(9).

So I am unsure if you like the change or not and should we go forward with it. My expectations of the compiler were different.

To me this is marginally less readable in the changed form. Both are much better than the original but considering your comments about what the compiler actually does I suggest we leave it the way it is!

First of all I would like the thank you for providing running code for discussing this change. This allows at least me to have a discussion based on concrete code.

I agree with Randall, that I do not think this code is clearer or easier to read for humans. I was thinking about the reason why this is my impression.
Here are some points which came up:

  • this code replicates the code of tcp_fields_to_net().
  • I am used to handle flags like tlb_eventflags in away, where you or a flag in and do the corresponding action in consecutive lines of code, since these belong together logically. Looking at the code this patch proposes stets all the flags in one place and, when I want to double check the setting the flags corresponds to the correct action requires scrolling.
  • Conceptually, in my mind, initializing a variable is one step. The code for this one step is 63 lines long. For me, this code is harder to read than the one it is replacing.
  • This code uses the declarative style for all fields, except for tlb_opts. For me this inconsistency is irrelevant.

But this is just my view, only focussing on the source code. So I am happy to consider arguments which are in favor of accepting the code. Focussing on the generated machine code, I understand your statement, that the code is not better than the one generated for the code you propose to replace.

I do see one benefit of the suggested change: if we add additional fields to struct tcp_log_buffer, they will be automatically initialized to 0 by the suggested code. Then there would have been no reason for my earlier fix (initializing tlb_flex1 and tlb_flex2).

Given that code generated is questionable I won't insist on this change at all. However, I'd like to list the pros of such style of filling structures.

  • It is a single statement, so compiler is allowed to do any optimizations inside it, including any useful reordering of memory writes.
  • It is a single statement, so when new person reads the code, they can skip all 60 lines knowing that nothing except filling tlb happens here, no side effects.
  • Reduced maintenance overhead when adding/removing fields to the structure.

this code replicates the code of tcp_fields_to_net().

This can be avoided by providing a static inline that would return a struct tcp instead of a pointer. It could be tcp_fields_to_net() itself, since this is the only place that uses it. So will look like:

.tlb_th = th_hostorder ? hton_tcphdr(*th) : *th,

Thanks for providing your view. It helps to understand why you are proposing it.
It should be noted, that the old code does not initialize all fields, when they are not used. I guess this was done intentionally due to performance considerations. This does not work with the style suggested. But I don't know how relevant this actually is.