Jan 22 2023
In D38053#865570, @jlduran_gmail.com wrote:
> Thank you! I'll wait for @asomers' comments to update the revision.
> I'll document here my small (not really important) wishlist:
> - Optional randomization of the test order, maybe something similar to pytest-random-order.

Nice one; it can probably work out of the box, but the scope will be a single file, due to the way the ATF<>pytest interaction is implemented.

> - When debugging atf-python tests, the output could be a little less busy; maybe pytest_terminal_summary et al. instead of print() could be one answer.

I get the intent and I agree. Maybe you could come up with an example (or even a diff) of how you see it? (Roughly, something like the first sketch after this exchange.)

> - An atf_get_srcdir equivalent, to read files relative to the source directory; however, I think the idea is to disassociate from ATF later on.

This one slipped through the cracks, thanks for the reminder! I will add it in a day or two, roughly along the lines of the second sketch below.

> - In the meantime, the kernel's delayed object reclamation issue is fixed; maybe adapt 80fc25025ffcb0d369fc0b6d4d272ad6fd3f53c3?

Yep, it has been on my list for quite some time; I'll make the change in a day or two.
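On the less-busy output: a minimal sketch of what I had in mind. pytest_terminal_summary is a real pytest hook; the log_debug helper and the reporting logic are hypothetical, not existing atf-python code:

    # conftest.py: collect debug lines during the tests and print them
    # once, in a summary section, instead of interleaving print() output.
    import pytest

    _debug_log = []

    def log_debug(msg):
        # Tests call this instead of print(); output is deferred.
        _debug_log.append(msg)

    def pytest_terminal_summary(terminalreporter, exitstatus, config):
        # Standard pytest hook: runs once after all tests have finished.
        if _debug_log and config.option.verbose > 0:
            terminalreporter.section("atf-python debug output")
            for line in _debug_log:
                terminalreporter.write_line(line)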
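And for the atf_get_srcdir equivalent, something along these lines; the helper name is made up here, not the final API:

    import os

    def get_srcdir(test_file):
        # Hypothetical helper: resolve the directory of the test module
        # itself, independent of the current working directory, similar
        # to atf_get_srcdir in atf-sh/atf-c.
        return os.path.dirname(os.path.abspath(test_file))

    # Usage inside a test module:
    #   path = os.path.join(get_srcdir(__file__), "input_data.txt")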
In D37971#866085, @ngie wrote:
> In D37971#863634, @melifaro wrote:
>> In D37971#863583, @ngie wrote:
>>> I hate to be a wet blanket, but a lot of the code seems to go against design decisions made in pytest around fixtures, extensibility, etc.
>>
>> First of all, thank you for the feedback.
>> Some of the things are, intentionally or unintentionally, done the ATF way, which does not match pytest. If you could share more detailed feedback, I'd try to change the framework to bring it closer to pytest.
>
> In particular, just glancing at this commit, it seems to rely on a JUnit-like structure (used in unittest), which is not strictly adhered to in pytest.
>
>> Currently, the primary use case of the Python integration is developing tests with relatively complex setup/teardown procedures (e.g. custom vnet-based topologies). That is the part I feel is least covered by the shell- or C-based tests. Any suggestions on doing it in a more pytest-like way?
>
> FWIW, I honestly think the integration should be the other way around: ATF should integrate into pytest, not pytest into ATF. If things were done that way and we used the JUnit output format (supported natively in pytest), we could move away from Kyua to a framework that is less bespoke, has far less boilerplate than ATF, has a better developer and user experience, and has better open-source mindshare than ATF/Kyua.
> The main value ATF/Kyua provides (IMHO) is the ability to integrate tests from NetBSD and a format to express legacy tests in, which gave FreeBSD a great head start in terms of CI/testability. Other than that, it's kind of a kludgy framework.

I agree with the ATF/Kyua assessment - I don't like the API or the (lack of) functionality either. I agree that pytest can be a good candidate for the core testing framework, provided there is consensus on that and the migration/support engineering resources are secured. I don't think we're there yet. Currently, shell and C test files account for around 90% of all test files, with similar figures across kernel and userland. Of the remaining Python files, only 10% are pytest tests; the rest are wrapped in shell scripts.
Personally, I think the most beneficial application of resources at the moment is improving the Python support and lowering the bar for adding Python tests, to drive adoption. I'd also note that the current state of things is not contrary to the "pytest instead of ATF" idea: it should be notably easier to convert these tests to the "original" pytest format once desired than to migrate the shell-script mess. What do you think?
P.S. The details of the percentage calculations are below. I know that some of the tests are third-party and not integrated into the framework, but it doesn't look like that changes the results significantly.
- Userland+kernel test file stats: find . -type f -name \*.<c|sh|py> -ipath '*/tests/*' | wc -l
- Kernel test file stats: find tests/sys -type f -name \*.<c|sh|py> | wc -l
- Pytest file stats: find tests -name 'test*.py' | wc -l
Results:
kernel: 297 (sh), 158 (c), 54 (py)
kernel+user: 533 (sh), 451 (c), 76 (py)
Pytest: 6 files

> pytest can run unittest-expressed tests out of the box. Over time they could be migrated to a purer pytest format (as needed), but at the very least to one that uses pytest for discovery, logging, etc.
> I think the sh tests could be expressed using pytest as well. The C tests seem best suited for ATF, and the C++ tests seem best suited for googletest.
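For context, pytest does collect plain unittest tests without modification, and the JUnit XML output mentioned above needs no plugin either (pytest --junitxml=results.xml writes a report most CI systems can ingest). A minimal sketch of such a test:

    # test_sample.py: plain unittest code; "pytest test_sample.py"
    # collects and runs it with no pytest-specific changes.
    import unittest

    class TestSample(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)

    if __name__ == "__main__":
        unittest.main()   # still runnable standalone as well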
If there is an implementation of the ATF protocol for test discovery/running on top of pytest, then both the sh and the C tests can be handled by pytest as well.
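A heavily hedged sketch of what the pytest side of that could look like, assuming only the documented atf-test-program(1) interface (-l to list test cases, -r to name the result file); none of this is existing code:

    import subprocess
    import tempfile

    def list_atf_cases(prog):
        # "prog -l" prints the test-case list; each case has an
        # "ident: <name>" line (atf-test-program(1) interface).
        out = subprocess.run([prog, "-l"], capture_output=True,
                             text=True, check=True).stdout
        return [line.split(":", 1)[1].strip()
                for line in out.splitlines()
                if line.startswith("ident:")]

    def run_atf_case(prog, case):
        # "prog -r resfile case" runs one case and writes its result
        # ("passed", "failed: <reason>", ...) into resfile.
        with tempfile.NamedTemporaryFile(mode="r+") as res:
            subprocess.run([prog, "-r", res.name, case])
            res.seek(0)
            return res.read().strip()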
> I just revived some Macbooks that I thought were dead this past weekend, and I have a hunch that I'll have a lot of dead time over the course of the week to develop for FreeBSD/OSS soon.
> I'm going to see what I can do to get back up to speed and help out with this effort.
It's worth writing something down to ensure we're on the same page.
What's the end state of the test suite you envision? Is it a combination of pytest/kyua/googletest?
If some of the tests (Python, sh, potentially C) are moving away from kyua, what are the interfaces provided by the test runner? Are those interfaces going to stay the same or change? (The biggest difference between kyua and pytest, for example, is the explicit cleanup procedure, run in a separate process; see the sketch below.)
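To make that cleanup difference concrete: kyua executes an ATF test's cleanup routine in a fresh process even if the test body crashed, whereas the usual pytest idiom keeps teardown in-process via a yield fixture. A minimal sketch of the pytest side (the jail helpers are hypothetical stand-ins):

    import pytest

    def create_jail():
        # Hypothetical stand-in for real vnet/jail setup.
        return "test_jail"

    def destroy_jail(jail):
        # Hypothetical stand-in for real teardown.
        pass

    @pytest.fixture
    def vnet_jail():
        jail = create_jail()
        yield jail            # the test body runs at this point
        # Teardown runs in the same process after the test; if the
        # process dies mid-test, this never runs -- unlike ATF's
        # cleanup, which kyua executes separately.
        destroy_jail(jail)

    def test_uses_jail(vnet_jail):
        assert vnet_jail == "test_jail"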
In D38126#866510, @kp wrote:
> In D38126#866296, @vegeta_tuxpowered.net wrote:
>> In D38126#866273, @kp wrote:
>>> I'm not 100% committed to this approach, but at the very least I'd like to see examples of what the tests themselves end up looking like.
>>
>> The use case for those changes is in D38129. I thought splitting the huge amount of test changes I ended up with while developing the OpenBSD-like scrub syntax into separate reviews would be a good idea, but I think it backfired. Should we merge those two reviews into one?
>
> That, along with the code changes themselves, is on my list, but I don't expect to get the serious time and concentration that will require before March.

Would you be open to reviewing a Python re-implementation of some of the existing pf tests?
Jan 21 2023
Jan 20 2023
In D38126#866273, @kp wrote:
> There's certainly a lot of repetition in many of the pf (and other firewall) tests.
> So far I've chosen not to do anything about that, because it makes each individual test case much easier to understand. Each test case fully describes the setup it operates in, and when someone tries to debug it (or understand it for any other reason), there's no need to go look at other files to work out what the setup actually is.
> I'm not 100% committed to this approach, but at the very least I'd like to see examples of what the tests themselves end up looking like.

Sure!
https://github.com/freebsd/freebsd-src/blob/main/tests/sys/netlink/test_rtnl_ifaddr.py - probably the clearest one
Thank you for working on improving the testing infra!
I'd suggest looking into the already-existing native python functionality: https://github.com/freebsd/freebsd-src/blob/main/tests/examples/test_examples.py#L86
It would be nice to use something better than a shell for complex test scenarios; that would improve reusability and reduce the amount of code.
What do you think?
Jan 17 2023
In D38098#865469, @markj wrote:
> In D38098#865466, @melifaro wrote:
>> Thank you for working on that! It has been on my list for quite some time. I was a bit concerned about performance and was thinking of measuring the different approaches. Anyway, let's make it safe first and work on improvements later.
>
> One other approach might be to instead make nlmsg_reserve_object() and nlmsg_reserve_data() zero the buffer area. I suspect the overhead of zeroing is pretty close to negligible in either case, though?

Yep. Re overhead: it depends. A full-view dump is hundreds of megabytes, mainly consisting of attributes (which don't require zeroing). Anyway, it shouldn't be a _huge_ difference, and it can be addressed later.
Thank you again for addressing the security issue.
Thank you!
So generally, it looks good to me, and I'm fine with committing the change. @asomers: what do you think?
Jan 16 2023
In D38053#865132, @jlduran_gmail.com wrote:
> Here are the changes from the last iteration:
> - Use a dictionary for expectations; this keeps both the ping and pinger tests inline. Comparing the expected with the actual subprocess.CompletedProcess was also not feasible. It was also suggested initially by @melifaro.
>
> One thing I noticed is that atf-python tests are slower than atf-sh. This was somewhat expected, but when all the tests run, the total time builds up. I believe the gains are really on the development side.
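A rough illustration of the expectations-dictionary shape being described (helper and field names are hypothetical, not the actual D38053 code):

    import subprocess

    def check_ping(args, expected):
        # Compare selected fields of the CompletedProcess against an
        # expectations dictionary, one entry per field of interest.
        proc = subprocess.run(["ping"] + args,
                              capture_output=True, text=True)
        if "returncode" in expected:
            assert proc.returncode == expected["returncode"]
        if "stdout_contains" in expected:
            assert expected["stdout_contains"] in proc.stdout

    # Usage:
    # check_ping(["-c1", "localhost"],
    #            {"returncode": 0, "stdout_contains": "transmitted"})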
Q: what's the runtime, e.g. the median time reported by kyua test?
There are some tests in net/routing doing similar SingleVnet-isolated testing, written both in Python and in C.
The C version takes ~120ms per test; the Python version takes ~330ms. My take is that a runtime under 0.5 seconds is fine. It may also not be easy to shave off those 330ms, as pytest does a lot of preparation work but the runner calls it for a single test.
I suspect that importing scapy.all may contribute to the delay here. Note that kyua needs to first run the list step for the tests (to determine the isolation details) and then the actual test and cleanup procedures.
Importing scapy.all at the top of the file makes all three invocations wait for scapy initialization. I'd probably try to check whether it's possible to use a scapy subset and/or load it only when needed, e.g. as sketched below.
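For instance, a sketch of deferring the import so that the list step doesn't pay for it:

    def test_icmp_echo():
        # Import inside the test body, and only the needed subset,
        # so kyua's list invocation skips scapy's startup cost.
        from scapy.layers.inet import IP, ICMP
        pkt = IP(dst="192.0.2.1") / ICMP()
        assert pkt[ICMP].type == 8   # ICMP echo request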
In D38068#865080, @markj wrote:
> In D38068#865079, @melifaro wrote:
>> In D38068#865044, @markj wrote:
>>> In D38068#865012, @melifaro wrote:
>>>> LGTM.
>>>> It would be nice to add a test for the basic PF_KEY functionality (for example, add one SA entry and list it afterwards).
>>>
>>> Ok, I will give it a try. Though this is indirectly tested already by the sys/netipsec tests' use of setkey, which is how I found the bug.
>>
>> Ack, then it's not necessary. Did any of the tests explicitly fail?
>
> Sort of. :) The problem was found by running tests on a kernel with KMSAN enabled: https://ci.freebsd.org/job/FreeBSD-main-amd64-KMSAN_test/
> By default, a KMSAN report causes a kernel crash, which is what happens in this case. I'm going through test suite failures (i.e., crashes) at the moment and fixing bugs.

Got it. So if we already have an automated way to check for the padding issues, that should be enough.
In D38068#865044, @markj wrote:
> In D38068#865012, @melifaro wrote:
>> LGTM.
>> It would be nice to add a test for the basic PF_KEY functionality (for example, add one SA entry and list it afterwards).
>
> Ok, I will give it a try. Though this is indirectly tested already by the sys/netipsec tests' use of setkey, which is how I found the bug.

Ack, then it's not necessary. Did any of the tests explicitly fail?
Add tests.
LGTM.
It would be nice to add a test for the basic PF_KEY functionality (for example, add one SA entry and list it afterwards).
Jan 15 2023
Jan 14 2023
Thank you for addressing the comments!
I'd still prefer to have the ids embedded explicitly via pytest.param (along the lines of the sketch below), but I don't insist on it.
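i.e., a small sketch of what I mean (the test and parameter names are made up):

    import pytest

    # Explicit, stable test ids via pytest.param, instead of ids
    # auto-generated from the parameter values.
    @pytest.mark.parametrize("family", [
        pytest.param("inet", id="ipv4"),
        pytest.param("inet6", id="ipv6"),
    ])
    def test_ping(family):
        assert family in ("inet", "inet6")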
Conceptually LGTM; please see some comments on the structure.