Overview
This is a follow-up to the kyua jail support patch, https://reviews.freebsd.org/D42350:
- 1% of this patch assumes that everyone wants faster test suite runs and that jail support is always built in, so it makes execenv="jail" the default for all tests. This part probably needs discussion and opinions from people who have worked with the test suite for a long time and have a feel for its real-life use cases and nuances. For demo purposes, this patch turns on the jail execution environment through a direct change in suite.test.mk; the setting is prepended so that individual tests can override it.
- 99% of this patch is unrelated to the question of whether "run in jail" should be the default. It is about existing tests: it either gives them the jail configuration they need to work when asked to run in a jail, or sets them back to the host execution environment.
The way it's done
I did a first run of the whole suite with execenv="jail" set for every test, and then iterated through the skipped, broken, and failed ones. The rules I followed:
- if a test has an issue with kernel module loading in a jail -- add respective execenv_jail params (if_epair is a frequent example)
- if a test needs additional rights or features in a jail -- add respective execenv_jail params (e.g. vnet, allow.mlock)
- if a test expects a user to load a module -- keep it as is
- if a test shows instability in a jail (panics, fails) -- set it to always ask for execenv="host", i.e. it won't be run in a jail by kyua
- if a test obviously cannot run in a jail -- set it to execenv="host"
- else -- set it to execenv="host"
ZFS tests are all set to execenv="host". This is done in a single place, atf.test.mk, and relies on the fact that those tests are the only ones that require ksh93. That saves time for now: the ZFS tests have a lot of Makefiles, and it is better to hear other opinions first before dealing with all of them.
Results
This way I've managed to move more tests out of the skipped/broken/failed categories with "run in jail" activated. Some failures remain, but judging by the CI they look like the usual fluctuations, i.e. open issues waiting to be fixed or tuned.
RFC
This patch could be a starting point after the kyua patch, if the latter gets a green light. I also believe that some tests can be improved in the future so that they move out of the host execenv and can run in parallel using jails.
Demo
I guess none of this says much without numbers to compare. I've managed to collect the following.
The baseline:
- 8 cpus
- all builds were based on CURRENT 314542de6d (Oct 27)
- pkg install python py39-pip py39-pytest jq perl5 openvpn ksh93 gtar isc-dhcp44-server
- pip install scapy
- kldload zfs pf pfsync pflog dummynet if_bridge if_ovpn ipdivert sctp carp ipsec tcpmd5 if_wg cryptodev if_stf
- sysctl kern.crypto.allow_soft=1
- sysctl kern.ipc.tls.enable=1
- /usr/tests/sys/cddl/zfs tests were excluded from the runs
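For anyone who wants to reproduce the baseline, the setup steps above can be collected into one script. This is only a sketch: the DRY_RUN switch and the run and setup_baseline helpers are names made up for this illustration, and by default the commands are printed instead of executed.

```sh
#!/bin/sh
# Baseline setup, copied from the list above.
# Assumption: DRY_RUN, run and setup_baseline are hypothetical helpers of
# this sketch. With DRY_RUN=1 (the default) commands are only printed;
# set DRY_RUN=0 and run as root on the test host to execute them.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_baseline() {
    # Packages and Python modules used by the tests
    run pkg install python py39-pip py39-pytest jq perl5 openvpn ksh93 gtar isc-dhcp44-server
    run pip install scapy
    # Kernel modules loaded on the host before the run
    run kldload zfs pf pfsync pflog dummynet if_bridge if_ovpn ipdivert sctp carp ipsec tcpmd5 if_wg cryptodev if_stf
    # Sysctls enabled for the crypto and kTLS tests
    run sysctl kern.crypto.allow_soft=1
    run sysctl kern.ipc.tls.enable=1
}

setup_baseline
```

Printing first makes it easy to review the list before changing anything on the host.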
AArch64
1 h 52 min -- no patches applied, non-parallel
8168/8216 passed (48 failed)
Test cases: 8216 total, 265 skipped, 35 expected failures, 1 broken, 47 failed
6730.37 real 394.08 user 747.83 sys
1 h 25 min -- no patches applied, parallelism=8
8131/8216 passed (85 failed)
Test cases: 8216 total, 277 skipped, 35 expected failures, 7 broken, 78 failed
5099.31 real 699.89 user 1164.26 sys
0 h 36 min -- kyua and this patch applied, parallelism=8
8164/8216 passed (52 failed)
Test cases: 8216 total, 262 skipped, 35 expected failures, 4 broken, 48 failed
2155.09 real 473.59 user 1040.41 sys
AMD64
2 h 6 min -- no patches applied, parallelism=8 (had to exclude a few tests due to constant panics)
8129/8204 passed (75 failed)
Test cases: 8204 total, 266 skipped, 35 expected failures, 5 broken, 70 failed
7556.41 real 4971.02 user 7299.63 sys
1 h 5 min -- kyua and this patch applied, parallelism=8
8172/8224 passed (52 failed)
Test cases: 8224 total, 253 skipped, 35 expected failures, 4 broken, 48 failed
3921.02 real 3637.01 user 4967.95 sys
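To put the wall-clock numbers side by side, here is a small sketch that divides the "real" times of the runs above. The speedup helper is a name made up for this illustration. Note that the AMD64 baseline ran with a few tests excluded (8204 vs 8224 test cases), so that ratio is only a rough indication.

```sh
#!/bin/sh
# speedup BASE NEW: prints BASE / NEW rounded to two decimals.
# Assumption: "speedup" is a hypothetical helper for this comparison;
# the inputs are the "real" (wall-clock) seconds reported above.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

echo "AArch64, non-parallel -> patched: $(speedup 6730.37 2155.09)x"
echo "AArch64, parallelism=8 -> patched: $(speedup 5099.31 2155.09)x"
echo "AMD64, parallelism=8 -> patched: $(speedup 7556.41 3921.02)x"
```

On these numbers the patched run is roughly 3.1x faster than the non-parallel AArch64 run, 2.4x faster than the parallel AArch64 run, and 1.9x faster on AMD64.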