Details

Reviewers

cc
tuexen
zlei
jhb
melifaro
glebius
kp
pauamma_gundo.com
eugen_grosbein.net

Group Reviewers

manpages

Commits

rG31cf66d7554c: dummynet: add simple gilbert-elliott channel model

Summary

Building a good analog of correlated loss behavior
across realistic environments in dummynet was cumbersome.

Introducing a state in the flow-set, plus four probabilities
(two for switching between the two states and one loss
probability per state), provides a simple Gilbert-Elliott
channel model. This streamlines the testing of burst-loss
environments.
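
A minimal userland sketch of the two-state model described in the summary, not the committed kernel code. The names (`ge_model`, `ge_drop`, `ge_uniform`), the field layout, and the xorshift64 generator are illustrative assumptions; how dummynet stores and orders the four values is not shown here. The sketch draws the drop decision before the state transition, which is one of several reasonable orderings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not the identifiers used in dummynet. */
enum ge_state { GE_GOOD, GE_BAD };

struct ge_model {
	enum ge_state state;	/* current channel state */
	double loss_good;	/* drop probability while in GE_GOOD */
	double loss_bad;	/* drop probability while in GE_BAD */
	double good_to_bad;	/* per-packet chance of GE_GOOD -> GE_BAD */
	double bad_to_good;	/* per-packet chance of GE_BAD -> GE_GOOD */
	uint64_t rng;		/* xorshift64 state, must be non-zero */
};

/* Return a pseudo-random double in [0, 1) using xorshift64. */
static double
ge_uniform(struct ge_model *m)
{
	assert(m->rng != 0);
	m->rng ^= m->rng << 13;
	m->rng ^= m->rng >> 7;
	m->rng ^= m->rng << 17;
	/* Keep the top 53 bits and scale by 2^53. */
	return ((double)(m->rng >> 11) / 9007199254740992.0);
}

/* Decide the fate of one packet: 1 = drop, 0 = forward. */
static int
ge_drop(struct ge_model *m)
{
	double loss, leave;
	int drop;

	if (m->state == GE_BAD) {
		loss = m->loss_bad;
		leave = m->bad_to_good;
	} else {
		loss = m->loss_good;
		leave = m->good_to_bad;
	}
	/* Drop decision uses the loss probability of the current state. */
	drop = ge_uniform(m) < loss;
	/* Then possibly switch state for the next packet. */
	if (ge_uniform(m) < leave)
		m->state = (m->state == GE_BAD) ? GE_GOOD : GE_BAD;
	return (drop);
}
```

Calling `ge_drop()` once per packet yields long runs of mostly delivered packets interrupted by bursts of loss whenever the model sits in the bad state.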

Test Plan

ipfw pipe 100 config plr 0.001
ipfw pipe 200 config plr 0.0,0.01,0.1,0.05

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

rscheff created this revision. Dec 8 2023, 11:16 PM

Herald added subscribers: donner, ae, imp. Dec 8 2023, 11:16 PM

rscheff requested review of this revision. Dec 8 2023, 11:16 PM

Harbormaster completed remote builds in B54875: Diff 131196. Dec 8 2023, 11:16 PM

print full probabilities also when G-loss is 0

Harbormaster completed remote builds in B54877: Diff 131198. Dec 8 2023, 11:23 PM

pauamma_gundo.com added inline comments. Dec 9 2023, 3:21 AM

sbin/ipfw/ipfw.8
3084	Worth being explicit that these arguments are comma-separated and mention their internal representation if relevant?
3093	Or explain what A is.

pauamma_gundo.com added a reviewer: manpages. Dec 9 2023, 3:21 AM

Minor style quibbles.

I'd also love to see a test case; even a basic one that just activates the packet-loss-rate code and sends a few ping packets through would provide some sanity checking. (My view is that while it might not be worth the effort to validate the bimodal loss rates here, it is worth just running the code, because that has a tendency to provoke lock issues, leaks, and so on.)

sys/netpfil/ipfw/ip_dn_io.c
503	I'm really not a fan of having the case and first line of code on the same line. I think style(9) also disagrees with it, and I don't see any examples of it in the ipfw/dummynet code either.
505	This might be clarified a little by having an enum for the states. Something like f->pf_state = PL_STATE_BAD perhaps?
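
The two suggestions above could look roughly like the following sketch. `PL_STATE_BAD` is the name proposed in the comment; `PL_STATE_GOOD`, `struct flowset_sketch`, and `pl_next_state()` are hypothetical stand-ins rather than the actual dummynet definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Named states instead of bare integers. */
enum pl_state { PL_STATE_GOOD, PL_STATE_BAD };

/* Stand-in for the flow-set; only the field needed here. */
struct flowset_sketch {
	enum pl_state pl_state;
};

/*
 * Advance the state machine. `leave` is nonzero when the random draw
 * says the channel leaves its current state. Each case label sits on
 * its own line, with the code on the following lines.
 */
static void
pl_next_state(struct flowset_sketch *f, int leave)
{
	assert(f != NULL);
	switch (f->pl_state) {
	case PL_STATE_BAD:
		if (leave)
			f->pl_state = PL_STATE_GOOD;
		break;
	case PL_STATE_GOOD:
		if (leave)
			f->pl_state = PL_STATE_BAD;
		break;
	}
}
```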

This revision is now accepted and ready to land. Dec 9 2023, 10:56 AM

extend man page with gilbert-elliott model description

This revision now requires review to proceed. Dec 9 2023, 11:56 AM

Harbormaster completed remote builds in B54880: Diff 131211. Dec 9 2023, 11:56 AM

rscheff marked 2 inline comments as done. Dec 9 2023, 12:00 PM

rscheff added inline comments.

sbin/ipfw/ipfw.8
3084	Ideally, I wanted to express the parameters as loss probabilities, but the literature on the Gilbert-Elliott model gives these (k, h) as transmission probabilities, the complement of the simple PLR loss probability, so I adjusted the code accordingly. Alternatively, I could call these parameters K and H, with K = (1-k) and H = (1-h), which would keep the code more streamlined and align the probabilities nicely with the simple PLR loss model... Any opinions?

enum the states

Harbormaster completed remote builds in B54881: Diff 131214. Dec 9 2023, 12:07 PM

rscheff marked 2 inline comments as done. Dec 9 2023, 12:07 PM

enum the states

Harbormaster completed remote builds in B54882: Diff 131215. Dec 9 2023, 12:11 PM

In D42980#980005, @kp wrote:

I'd also love to see a test case; even a basic one that just activates the packet-loss-rate code and sends a few ping packets through would provide some sanity checking. (My view is that while it might not be worth the effort to validate the bimodal loss rates here, it is worth just running the code, because that has a tendency to provoke lock issues, leaks, and so on.)

It doesn't seem that any test code exists currently to validate any of the dummynet functionality, though... And testing a stochastic / probability process is more involved.

But what I need this code for is to collect statistically relevant flow-completion times between the base TCP stack (A test) and an enhancement of the TCP stack (not discarding SACK data after an RTO; B test). To elicit the RTO (loss of a retransmission), there needs to be a quite significant loss probability during such a loss burst, but the TCP congestion window also has to have grown sufficiently. Thus a simple loss probability is not good enough to test this statistically in a short enough test campaign...

The idea was to place a gilbert-elliott model pipe on the lo0 interface, and have a tool like uperf transfer 10MB for 10000-100000 times, logging the completion times (and maybe other statistics) for each run...
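
To illustrate why a two-state model suits this kind of campaign, the sketch below computes the long-run loss rate and the mean length of a stay in the bad state for a two-state Markov chain. These are standard results for such a chain; the function names and the parameter naming (`p` for good-to-bad, `r` for bad-to-good) are assumptions for illustration and not taken from the patch.

```c
#include <assert.h>

/* Long-run fraction of packets that see the bad state: p / (p + r). */
static double
ge_bad_fraction(double p, double r)
{
	assert(p >= 0.0 && r >= 0.0 && p + r > 0.0);
	return (p / (p + r));
}

/* Long-run average drop probability across both states. */
static double
ge_mean_loss(double p, double r, double drop_good, double drop_bad)
{
	double bad = ge_bad_fraction(p, r);

	return (drop_good * (1.0 - bad) + drop_bad * bad);
}

/* Mean number of consecutive packets spent in the bad state: 1 / r. */
static double
ge_mean_bad_run(double r)
{
	assert(r > 0.0);
	return (1.0 / r);
}
```

Two configurations with the same `ge_mean_loss()` can differ widely in `ge_mean_bad_run()`, which is the property a single independent loss probability cannot express.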

Manual page change LGTM. I can't speak to consistency with code.

sbin/ipfw/ipfw.8
3084	I have no basis for an informed opinion on that point myself.

This revision is now accepted and ready to land. Dec 9 2023, 2:54 PM

zlei added inline comments. Dec 12 2023, 6:57 AM

sys/netpfil/ipfw/ip_dn_io.c
506	This `default: /* FALLTHROUGH */` looks weird to me. Do we have any other possible pl_state (in future)?

put code in default branch and fall through on specific case

This revision now requires review to proceed. Dec 12 2023, 9:35 AM

Harbormaster completed remote builds in B54901: Diff 131272. Dec 12 2023, 9:35 AM

guest-patmaddox added a subscriber: guest-patmaddox. Dec 13 2023, 10:02 PM

Discussed this in the transport call. Will change the probabilities in the Gilbert model from transmission probability back to drop probability for consistency within the tool, and document how to map the literature variables k and h to what the tool accepts.
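
The mapping mentioned above is a simple complement. A minimal sketch, assuming (as is usual in the literature, though not stated in this thread) that k is the transmission probability in the good state and h the one in the bad state; `tx_to_drop()` is a hypothetical helper and not part of ipfw.

```c
#include <assert.h>

/*
 * Convert a transmission (success) probability, such as k or h from
 * the literature, into the corresponding drop probability.
 */
static double
tx_to_drop(double tx_prob)
{
	assert(tx_prob >= 0.0 && tx_prob <= 1.0);
	return (1.0 - tx_prob);
}
```

For example, a literature value of h = 0.75 corresponds to a drop probability of 0.25 in that state.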

sys/netpfil/ipfw/ip_dn_io.c
506	It would not be inconceivable to extend this code from a simple Gilbert-Elliott (2 state, 4 probabilities) model to a full multi-state Markov chain (3 probabilities per state: loss prob, prob to move state forward, prob to move state backwards).
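
A rough sketch of what such a multi-state extension might look like, with three probabilities per state. Nothing here exists in dummynet; `struct mc_state`, `mc_next()`, and `mc_drop()` are hypothetical, and the caller is assumed to supply a uniform random draw `u` in [0, 1).

```c
#include <assert.h>
#include <stddef.h>

/* Three probabilities per state, as outlined in the comment. */
struct mc_state {
	double loss;		/* drop probability in this state */
	double forward;		/* probability of moving to the next state */
	double backward;	/* probability of moving to the previous state */
};

/*
 * Pick the next state index from one uniform draw u in [0, 1).
 * u < forward moves forward, the next `backward`-wide slice moves
 * backward, anything else stays. Moves past either end are ignored.
 */
static size_t
mc_next(const struct mc_state *s, size_t nstates, size_t cur, double u)
{
	assert(nstates > 0 && cur < nstates);
	assert(s[cur].forward + s[cur].backward <= 1.0);
	if (u < s[cur].forward)
		return (cur + 1 < nstates ? cur + 1 : cur);
	if (u < s[cur].forward + s[cur].backward)
		return (cur > 0 ? cur - 1 : cur);
	return (cur);
}

/* Drop decision for the current state: 1 = drop, 0 = forward. */
static int
mc_drop(const struct mc_state *s, size_t cur, double u)
{
	return (u < s[cur].loss);
}
```

With two states this reduces to the two-state model, since the first state can only move forward and the last state can only move backward.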

make gilbert model with loss prob, document in man page

Harbormaster completed remote builds in B54951: Diff 131441. Dec 14 2023, 4:06 PM

tuexen accepted this revision. Dec 14 2023, 4:10 PM

This revision is now accepted and ready to land. Dec 14 2023, 4:10 PM

In D42980#980052, @rscheff wrote:

In D42980#980005, @kp wrote:

I'd also love to see a test case; even a basic one that just activates the packet-loss-rate code and sends a few ping packets through would provide some sanity checking. (My view is that while it might not be worth the effort to validate the bimodal loss rates here, it is worth just running the code, because that has a tendency to provoke lock issues, leaks, and so on.)

It doesn't seem that any test code exists currently to validate any of the dummynet functionality, though... And testing a stochastic / probability process is more involved.

We do already have some dummynet tests in https://cgit.freebsd.org/src/tree/tests/sys/netpfil/common/dummynet.sh
Those are very high-level tests, and don't validate the statistical correctness of what dummynet does, but they do serve to detect things like lock order or cleanup issues.

Manual page English still LGTM.

add kyua test cases for dummynet pls

This revision now requires review to proceed. Dec 16 2023, 1:40 PM

Harbormaster completed remote builds in B54975: Diff 131491. Dec 16 2023, 1:40 PM

add allow-all rule prior to sanity check, validate an approximate percentage of pings get dropped

Harbormaster completed remote builds in B54978: Diff 131496. Dec 16 2023, 9:24 PM

rscheff mentioned this in D43078: netpfil tests: do not run ipfw tests if net.inet.ip.fw.default_to_accept is not set. Dec 17 2023, 10:04 AM

kp added inline comments. Dec 17 2023, 4:41 PM

tests/sys/netpfil/common/dummynet.sh
600	We're going to send a ping every 100ms (because -i .1), and keep running that for 60 seconds (so 6x), with a 10-second interval (so we lose a few iterations). I don't think we need the `:10` here. We probably want to do `ping -i .01`, and drop the `:10`. Ideally we'd want to extract the actual loss rate into a variable and do math on that, but that's not exactly straightforward with `ping`.

speed up test case from 60sec down to a maximum of 6sec, if initial check fails

Harbormaster completed remote builds in B54984: Diff 131512. Dec 17 2023, 7:33 PM

rscheff marked an inline comment as done. Dec 17 2023, 7:33 PM

rscheff added inline comments.

tests/sys/netpfil/common/dummynet.sh
600	Actually, looking at https://github.com/freebsd/atf/commit/d7c7c53c0626ab59a62aa4efcf05323b3621baa9, the -r option does seem to repeatedly call whatever test function / binary when the check fails, until the timeout expires or the check passes, and waits :<n> milliseconds between two consecutive calls. The ping process with -c 100 -i .1 would take 10 sec, so this should call the pinger up to 5 or 6 times. And yes, I'll reduce the ping interval to 10ms (-i 0.010) for faster execution, and cut the -r down to 6:10 at the atf_check. Since there is no point waiting 50ms between executions of the ping process, I'll keep that at 10ms. Since these are probabilistic losses, I think giving a reasonable confidence interval for how many good pings to expect should be good enough.
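
As an illustration of the confidence-interval idea, the sketch below checks whether an observed count of successful pings lies within `z` standard deviations of the binomial expectation. It assumes independent losses (the single-value plr case); bursty Gilbert-Elliott loss has a larger variance, so a wider band would be needed there. `pings_plausible()` is a hypothetical helper and not what the committed test does.

```c
#include <assert.h>

/*
 * Return 1 if `good` successful pings out of `n` is within `z`
 * standard deviations of the binomial mean for drop probability
 * `loss`. Compares squared values, so no square root is needed.
 */
static int
pings_plausible(unsigned int n, double loss, unsigned int good, double z)
{
	double mean, var, dev;

	assert(loss >= 0.0 && loss <= 1.0 && good <= n);
	mean = n * (1.0 - loss);		/* expected successes */
	var = n * loss * (1.0 - loss);		/* binomial variance */
	dev = good - mean;
	return (dev * dev <= z * z * var);
}
```

With 100 pings, a 50% loss rate, and z = 3, any count between 35 and 65 successful pings would be accepted.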

kp accepted this revision. Dec 17 2023, 8:26 PM

kp added inline comments.

tests/sys/netpfil/common/dummynet.sh
600	"Since there is no point waiting 50ms between executions of the ping process, I'll keep that at 10ms." I did not read the man page with sufficient attention, and thought :10 would imply a 10 second (rather than the correct 10 milliseconds) wait between invocations. So yes, that's worth keeping as you had it.

This revision is now accepted and ready to land. Dec 17 2023, 8:26 PM

Closed by commit rG31cf66d7554c: dummynet: add simple gilbert-elliott channel model (authored by rscheff). Dec 17 2023, 9:57 PM

This revision was automatically updated to reflect the committed changes.

rscheff marked an inline comment as done.

rscheff added a commit: rG31cf66d7554c: dummynet: add simple gilbert-elliott channel model.

Revision Contents
Changeset List

Diff 131513

sbin/ipfw/dummynet.c

sbin/ipfw/ipfw.8

sys/netinet/ip_dummynet.h

sys/netpfil/ipfw/ip_dn_glue.c

sys/netpfil/ipfw/ip_dn_io.c

sys/netpfil/ipfw/ip_dn_private.h

tests/sys/netpfil/common/dummynet.sh