netinet tests: Enable parallelization of forwarding tests
Abandoned · Public

Authored by markj on Apr 4 2023, 8:52 PM.

Details

Reviewers

kp
melifaro

Summary

Each of these tests creates a vnet jail and an epair, and assigns an IP
address to the host side of the epair. They cannot be run in
parallel due to IP address conflicts.

Enable parallelization by creating a second jail per test and putting
the other end of the epair in the other jail. This does not make the
tests significantly more complicated.
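
As a rough illustration of the layout described above (the diff itself is not reproduced here), the following dry-run sketch lists the commands that would put each end of an epair into its own vnet jail. Nothing is executed. The function names, the jail naming scheme and the 192.0.2.0/24 documentation addresses are assumptions made for this example, not taken from the revision.

```python
from typing import List


def two_jail_setup(test_id: str, epair: str,
                   addr_a: str, addr_b: str) -> List[List[str]]:
    """Return commands that place each epair end in its own vnet jail.

    Nothing is executed. `ifconfig epair create` is assumed to have
    already created `epair` + "a" and its peer `epair` + "b".
    """
    jail_a = f"{test_id}_a"  # hypothetical naming scheme
    jail_b = f"{test_id}_b"
    if_a, if_b = f"{epair}a", f"{epair}b"
    return [
        # Each jail gets its own network stack and one end of the epair.
        ["jail", "-c", f"name={jail_a}", "persist", "vnet",
         f"vnet.interface={if_a}"],
        ["jail", "-c", f"name={jail_b}", "persist", "vnet",
         f"vnet.interface={if_b}"],
        # Addresses are assigned only inside the jails, never on the host.
        ["jexec", jail_a, "ifconfig", if_a, "inet", addr_a, "up"],
        ["jexec", jail_b, "ifconfig", if_b, "inet", addr_b, "up"],
    ]


def touches_host_addresses(commands: List[List[str]]) -> bool:
    """True if any ifconfig invocation would run directly on the host."""
    return any(cmd[0] == "ifconfig" for cmd in commands)


if __name__ == "__main__":
    for cmd in two_jail_setup("fwd", "epair0",
                              "192.0.2.1/24", "192.0.2.2/24"):
        print(" ".join(cmd))
```

Because every address lives inside a per-test vnet, two tests using the same prefix no longer conflict on the host; only the jail names still have to differ.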

We have other tests which have the same problem. I'm not sure whether
such a solution is appropriate there since it would involve a lot of
churn. This revision is meant to garner feedback and see what folks
think.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 50762
Build 47653: arc lint + arc unit

Event Timeline

markj created this revision. Apr 4 2023, 8:52 PM

Herald added subscribers: glebius, ae, asomers, imp. Apr 4 2023, 8:52 PM

markj requested review of this revision. Apr 4 2023, 8:52 PM

Harbormaster completed remote builds in B50762: Diff 119857. Apr 4 2023, 8:52 PM

I’ll rewrite those two in Python, where we have an easy auto-vnet.
Personally, I’d love to see us reduce the number of shell tests in scenarios like this.

In D39420#897295, @melifaro wrote:

I’ll rewrite those two in Python, where we have an easy auto-vnet.
Personally, I’d love to see us reduce the number of shell tests in scenarios like this.

So I should drop this diff and just mark them as is_exclusive for now? That's fine if so.

In D39420#897297, @markj wrote:

In D39420#897295, @melifaro wrote:

I’ll rewrite those two in Python, where we have an easy auto-vnet.
Personally, I’d love to see us reduce the number of shell tests in scenarios like this.

So I should drop this diff and just mark them as is_exclusive for now? That's fine if so.

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

In D39420#897303, @markj wrote:

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

My gut feeling is that shell tests are most appropriate for cases like this, where the logic is very, very simple. All we're doing is building very static configurations, using the tools users would also use for that task.

I need to look at some examples of the python tests (which I hope to get to when I'm home again, so in the next few days) to get a better informed opinion.

I'm not opposed to adding a requirement that tests do not add IP addresses or interfaces to the host, but instead do everything in a jail. It has a relatively small impact on the test complexity, and would indeed enable parallel use (once we also de-duplicate the jail names).
We should find a good place to document that requirement though (along with other policies, such as when/if to load kernel modules and change global sysctls).
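
One possible way to de-duplicate jail names, sketched under assumptions: derive each name from the test case, the jail's role and the PID of the running test process. The helper name and the naming scheme are hypothetical and not something the tests currently do.

```python
import os
import re


def jail_name(test_case: str, role: str, pid: int) -> str:
    """Build a jail name that differs between concurrently running tests.

    Characters other than letters, digits, '_' and '-' are replaced
    with '_' to keep the name simple (in particular, no '.').
    """
    raw = f"{test_case}_{role}_{pid}"
    return re.sub(r"[^A-Za-z0-9_-]", "_", raw)


if __name__ == "__main__":
    # Two test processes running at the same time have different PIDs,
    # so their jail names cannot collide.
    print(jail_name("fwd_ip_icmp", "router", os.getpid()))
```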

In D39420#897335, @kp wrote:

In D39420#897303, @markj wrote:

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

My gut feeling is that shell tests are most appropriate for cases like this, where the logic is very, very simple. All we're doing is building very static configurations, using the tools users would also use for that task.

I tend to agree, but I also haven't spent much time trying to write tests in python.

I need to look at some examples of the python tests (which I hope to get to when I'm home again, so in the next few days) to get a better informed opinion.

I'm not opposed to adding a requirement that tests do not add IP addresses or interfaces to the host, but instead do everything in a jail. It has a relatively small impact on the test complexity, and would indeed enable parallel use (once we also de-duplicate the jail names).

We should find a good place to document that requirement though (along with other policies, such as when/if to load kernel modules and change global sysctls).

There is tests(7), perhaps we should add a section there aimed at test authors? I could take a stab at that.

Regarding policies, my current half-baked idea is that we should repurpose the "ci" test variable to mean, "do whatever you want to the system." For CI purposes this simplifies things somewhat; one wouldn't have to explicitly configure a test VM to load a laundry list of modules. Similarly, there should be no need to explicitly enable KTLS or unsafe AIO to avoid skipping useful tests. Though, one still has to know which third-party software to install, so some configuration would still be needed. I don't see a good way around that, but we should try to minimize the effort needed to stand up a CI instance.
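
To make the proposed policy concrete, here is a minimal sketch of the decision a test (or the framework) might make about a kernel module it needs. It assumes the "ci" variable arrives as a string in a configuration dictionary. The function name, the accepted truthy values and the three outcomes are illustrative assumptions, not existing kyua or ATF behavior.

```python
from typing import Dict, Set


def module_action(module: str, loaded: Set[str],
                  config: Dict[str, str]) -> str:
    """Decide what to do about a kernel module that a test requires.

    Returns "run" if the module is already loaded, "load" if the
    (hypothetical) "ci" variable permits modifying the system, and
    "skip" otherwise.
    """
    if module in loaded:
        return "run"
    if config.get("ci", "false").lower() in ("1", "true", "yes"):
        return "load"
    return "skip"


if __name__ == "__main__":
    # On a developer machine without ci set, the test is skipped
    # rather than changing global state.
    print(module_action("pf", {"if_epair"}, {}))
    # On a CI instance, the test may load what it needs.
    print(module_action("pf", {"if_epair"}, {"ci": "true"}))
```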

In D39420#897335, @kp wrote:

In D39420#897303, @markj wrote:

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

My gut feeling is that shell tests are most appropriate for cases like this, where the logic is very, very simple. All we're doing is building very static configurations, using the tools users would also use for that task.

I agree, the logic here is simple. The implementation, however, contains quite a lot of boilerplate code and ends up calling Python to craft and receive the packets. What I'm saying is that we should have a framework that lets us write the business logic easily, without all that boilerplate.

I need to look at some examples of the python tests (which I hope to get to when I'm home again, so in the next few days) to get a better informed opinion.

I've created D39445 with the rewrite so it's easier to reason about the approaches.

I'm not opposed to adding a requirement that tests do not add IP addresses or interfaces to the host, but instead do everything in a jail. It has a relatively small impact on the test complexity, and would indeed enable parallel use (once we also de-duplicate the jail names).

I'd second that. The only thing we need to ensure is that there is a convenient framework/API to do so. For example, there's no such API for C.

We should find a good place to document that requirement though (along with other policies, such as when/if to load kernel modules and change global sysctls).

I guess we can do it in tests(7), the ATF examples, and the wiki. It would probably also be good to mark the tests that don't follow this approach, so that people don't copy them when writing new ones.

In D39420#897303, @markj wrote:

In D39420#897297, @markj wrote:

In D39420#897295, @melifaro wrote:

I’ll rewrite those two in Python, where we have an easy auto-vnet.
Personally, I’d love to see us reduce the number of shell tests in scenarios like this.

So I should drop this diff and just mark them as is_exclusive for now? That's fine if so.

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

I don't think we should rewrite the existing tests in Python either; it has little business value. I'm talking about new tests.

In D39420#897336, @markj wrote:

In D39420#897335, @kp wrote:

In D39420#897303, @markj wrote:

That said, I'm not really convinced that rewriting everything in python is a good approach. Certainly new tests can be written that way, but rewriting is a lot of work for little gain (and also has the disadvantage of potentially introducing bugs). The net/netinet/netinet6 tests have a lot of shell scripts. In general, I'd prefer to keep existing tests and mark them as deprecated somehow. If they run reasonably quickly, there is little harm in keeping them indefinitely.

My gut feeling is that shell tests are most appropriate for cases like this, where the logic is very, very simple. All we're doing is building very static configurations, using the tools users would also use for that task.

I tend to agree, but I also haven't spent much time trying to write tests in python.

I've created D39445 with the rewrite so that the two approaches are easier to compare.

I need to look at some examples of the python tests (which I hope to get to when I'm home again, so in the next few days) to get a better informed opinion.

I'm not opposed to adding a requirement that tests do not add IP addresses or interfaces to the host, but instead do everything in a jail. It has a relatively small impact on the test complexity, and would indeed enable parallel use (once we also de-duplicate the jail names).

We should find a good place to document that requirement though (along with other policies, such as when/if to load kernel modules and change global sysctls).

There is tests(7), perhaps we should add a section there aimed at test authors? I could take a stab at that.

Regarding policies, my current half-baked idea is that we should repurpose the "ci" test variable to mean, "do whatever you want to the system." For CI purposes this simplifies things somewhat; one wouldn't have to explicitly configure a test VM to load a laundry list of modules. Similarly, there should be no need to explicitly enable KTLS or unsafe AIO to avoid skipping useful tests. Though, one still has to know which third-party software to install, so some configuration would still be needed. I don't see a good way around that, but we should try to minimize the effort needed to stand up a CI instance.

I agree: checking whether the kernel has the functionality, and loading modules that have no side effects, should be part of the testing framework. Currently, the pytest infrastructure does some of that (and there are plans to add automated INET/INET6 checks to skip non-relevant tests).
Re ci: agree (or add a new one, depending on the number of tests).

jlduran mentioned this in D39445: tests: convert forwarding tests to python. Apr 6 2023, 1:55 PM

I won't commit this. Probably we can use these tests as a starting point for integrating D42350.