Page MenuHomeFreeBSD

debugnet(4): Add optional full-duplex mode
ClosedPublic

Authored by cem on Sep 6 2019, 12:53 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 17, 1:54 AM
Unknown Object (File)
Fri, Dec 6, 6:04 PM
Unknown Object (File)
Thu, Dec 5, 11:35 PM
Unknown Object (File)
Nov 16 2024, 7:19 PM
Unknown Object (File)
Oct 24 2024, 2:54 AM
Unknown Object (File)
Oct 3 2024, 9:31 AM
Unknown Object (File)
Oct 2 2024, 3:43 PM
Unknown Object (File)
Sep 19 2024, 5:42 PM
Subscribers

Details

Summary

It remains unattached to any client protocol. Netdump is unaffected
(remaining half-duplex). The intended consumer is NetGDB.

Test Plan

Confirmed netdump client continues to work correctly with stock netdumpd from ports.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 26610
Build 24989: arc lint + arc unit

Event Timeline

sys/net/debugnet.c
354

Why is "netdump" in the name? Isn't this routine for full-duplex protocols?

376

Why not const struct debugnet_msg_hdr *?

388

One thing to be careful about here is the sorcerer's apprentice problem: if the received packet was delayed enough that the server retransmitted, here I think we'll ACK both the original packet and the retransmit. Without knowing what the netgdb proxy does with an ACK, I'm not sure if the duplicate ACKs will actually cause problems.

sys/net/debugnet_inet.c
206 ↗(On Diff #61710)

Can you explain why this fragment is needed? Is it related to the rest of the diff?

sys/net/debugnet.c
354

I honestly have no idea. Brain-o.

376

I don't like typing long typenames twice (it's already in the declaration), and const void* is equally valid given mtod just blindly casts.

Honestly, mtod is useless. This seems better to me: https://godbolt.org/z/lUXC9v

388

This race can happen in any protocol with Acks, no? I think the sender must always handle duplicate acks (and the in-progress version of the proxy handles duplicate acks already).

sys/net/debugnet_inet.c
206 ↗(On Diff #61710)

It's orthogonal and could be extracted into an independent change.

sys/net/debugnet.c
388

Sure, but how exactly does the proxy handle them? The problem would arise if the proxy sent out a "next" packet as a response to an ACK, and sent out multiple copies of the packet in response to duplicate ACKs.

sys/net/debugnet.c
376

I guess it's more helpful to the reader in some of the massive network input/output routines we have, where looking at the declaration would involve scrolling back hundreds of lines.

sys/net/debugnet.c
376

I guess. When I read code with an off-screen declaration block, I tend to look at how the variables are used to establish type, if I don't know off-hand. Generally the reason to mtod() something is to access it, which means either a direct struct dereference or passing the pointer to a function. Ctags works well for either of those, if they happen near the mtod.

That said, I'm definitely supportive of refactoring the massive network routines we have down to more bite-sized pieces independent of all that.

388

So big caveat, the following is just describing the as-is state of the nonfunctional C proxy; I might be missing something:

The problem would arise if the proxy sent out a "next" packet as a response to an ACK

The as-is proxy doesn't anything conditional on recieving TX acks today.

and sent out multiple copies of the packet in response to duplicate ACKs.

When the proxy initially sends a message to netgdb, it stores a copy of the TX message in an "unacked TX" list. On first ack with matching seqno, it just frees the buffer from the list. An Ack that doesn't match an on-list buffer is ignored. (A to-be-implemented resend asynchronous timer periodically resends anything on the unacked list, like debugnet/netdump today.)

These checks can be done fairly cheaply even with a linked list because we can track a txseqno below which all packets have been acked. Also, the list is inherently sorted by seqno; some combination of low packet loss, low packet reordering, and limited list size means we're unlikely to spend a lot of time traversing the list in practice. [Though: unlike the kernel debugnet, I did not place any upper limit on the number of unacked TX messages.]

cem marked 4 inline comments as done.
  • Split out orthogonal change
  • Drop netdump from function name having nothing to do with netdump

Ok, I think the first round of comments is addressed.

sys/net/debugnet.c
388

I'm marking this thread "done" on the basis of I (believe that I) understand the concern and need for care in the other conversant (i.e., the gdb proxy), and to the best of my understanding, there is nothing actionable about it from this end of the protocol.

sys/net/debugnet_inet.c
206 ↗(On Diff #61710)

Extracted to D21745

sys/net/debugnet.c
388

I had written the following comment but neglected to send it, sorry. "That seems reasonable to me: the main concern is that an agent receiving duplicate ACKs should not perform a duplicate action (like sending out multiple copies of the same packet in response), and it sounds like you're handling that properly."

sys/net/debugnet.c
388

Ah, glad we're on the same page.

This is accounted for here in the current version of the Python proxy: https://reviews.freebsd.org/P314$331-336 . (Those line number links may break slightly as I update the Python proxy.)

I forget how familiar you are with Python, but if you aren't (or for other readers), debugnet_unacked_tx is a dictionary/map/hashtable from sequence numbers to sent messages; "not in" is a check for the absence of the received ACK seqno, and "del foo[bar]" removes the K-V pair associated with key bar from the dictionary/map/hashtable foo.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 17 2019, 8:25 PM
This revision was automatically updated to reflect the committed changes.