new test: after destroying a jail, its vnet interfaces should be visible by host
Needs ReviewPublic

Authored by olivier on Jul 22 2022, 9:47 PM.

Details

Reviewers
zec
kp
ngie
melifaro
glebius
Group Reviewers
tests
Summary

Add a regression test checking that a vnet interface assigned to a jail is visible from the host again once the jail is destroyed.

cf https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264981

Test Plan

There is a regression on -head only. On 13.1-RELEASE the test passes:

root@fbsd131:/usr/tests # uname -a
FreeBSD fbsd131 13.1-RELEASE FreeBSD 13.1-RELEASE releng/13.1-n250148-fc952ac2212 GENERIC amd64
root@fbsd131:/usr/tests # kyua debug usr.sbin/jail/jail_basic_test:remove
Executing command [ ifconfig lo0 inet 172.254.254.254/32 alias ]
Executing command [ jail -c name=removejail persist ip4.addr=172.254.254.254 ]
Executing command [ jexec removejail ifconfig lo0 ]
Executing command [ jail -R removejail ]
Executing command [ jls -d -j removejail ]
usr.sbin/jail/jail_basic_test:remove  ->  passed

And on head:

root@current:/usr/tests # uname -a
FreeBSD bigone 14.0-CURRENT FreeBSD 14.0-CURRENT #148 main-n256820-6a26c99f827: Wed Jul 20 00:50:17 CEST 2022     root@bigone:/usr/obj/usr/src/amd64.amd64/sys/BBR amd64
root@current:/usr/tests # kyua debug usr.sbin/jail/jail_basic_test:remove
Executing command [ ifconfig lo0 inet 172.254.254.254/32 alias ]
Executing command [ jail -c name=removejail persist ip4.addr=172.254.254.254 ]
Executing command [ jexec removejail ifconfig lo0 ]
Executing command [ jail -R removejail ]
Executing command [ jls -d -j removejail ]
Fail: incorrect exit status: 0, expected: 1
stdout:
   JID  IP Address      Hostname                      Path
     1  172.254.254.254                               /

stderr:

usr.sbin/jail/jail_basic_test:remove  ->  failed: atf-check failed; see the output of the test for details

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

ngie requested changes to this revision. Jul 25 2022, 12:40 AM
ngie added a subscriber: ngie.
ngie added inline comments.
tests/sys/net/if_destroy_vnet.sh
1 ↗(On Diff #108417)

Silly question: does the $FreeBSD$ RCS tag matter after moving to git?

9 ↗(On Diff #108417)

Maybe this instead?

22 ↗(On Diff #108417)

sleep 1 is racy: please confirm that jexec jifdestroy and jail -R succeed.

This revision now requires changes to proceed. Jul 25 2022, 12:40 AM
tests/sys/net/if_destroy_vnet.sh
22 ↗(On Diff #108417)

sleep 1 is racy: please confirm that jexec jifdestroy and jail -R succeed.

Polling for jail shutdown would probably be wise too...

Yes, the sleep is racy, but I have no idea how to fix it.
Without it, the test always fails: there is a delay between the jail's destruction and the interface becoming visible again from the host.

So I've tested a loop like this one:

while jls -dj jifdestroy >/dev/null 2>&1; do
        sleep 1
done

with a dummy script on 13.1-release like this one:

#!/bin/sh
#set -eu
ifconfig lo888 create
jail -c name=bug persist vnet vnet.interface=lo888
jexec bug ifconfig lo888 up
jail -R bug
while jls -dj bug >/dev/null 2>&1; do
        echo "wait"
        sleep 1
done
ifconfig lo888 destroy && echo "success" || echo "fails"

And it fails in about 1 run out of 5:

 # sh -x ./yo.sh
+ ifconfig lo888 create
+ jail -c 'name=bug' persist vnet 'vnet.interface=lo888'
+ jexec bug ifconfig lo888 up
+ jail -R bug
+ jls -dj bug
+ ifconfig lo888 destroy
ifconfig: interface lo888 does not exist
+ echo fails
fails

So the only stable solution seems to be the racy sleep.

Update comments following advice

Checking that the vnet is not dying is the right thing: once the vnet is destroyed you should get the interface back. I’d prefer to have this logic in the test instead of a random sleep.
If this fails (on current head) I can take a look in about a week.
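
A minimal sketch of what such polling could look like in place of a fixed sleep. The helper names (wait_until, iface_on_host, jail_gone) are hypothetical and not part of this revision; only `jls -d -j` and `ifconfig` come from the thread, and `ifconfig -l` is assumed to list the host's interface names on one line. The retry budget bounds the wait, so the test fails instead of hanging if the condition never becomes true.

```shell
#!/bin/sh
# Hypothetical helper (not part of the revision): retry a command until it
# succeeds or the retry budget is used up.
# usage: wait_until <max_tries> <delay_seconds> <command> [args...]
wait_until()
{
        _tries=$1
        _delay=$2
        shift 2
        while [ "$_tries" -gt 0 ]; do
                # Success: the condition we are waiting for now holds.
                if "$@" >/dev/null 2>&1; then
                        return 0
                fi
                _tries=$((_tries - 1))
                sleep "$_delay"
        done
        # Budget exhausted without the condition becoming true.
        return 1
}

# True once the named interface is listed on the host.
iface_on_host()
{
        ifconfig -l | tr ' ' '\n' | grep -qx "$1"
}

# True once the jail no longer exists, not even in the dying state.
jail_gone()
{
        ! jls -d -j "$1" >/dev/null 2>&1
}
```

In the dummy script, something like `wait_until 30 1 iface_on_host lo888` could stand in for the `while jls ...` loop, since it waits on the interface itself rather than on the jail. The 30-try limit is an arbitrary choice.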

Thanks for the tip about checking for the dying state: this is the root cause. The jail is stuck forever in the dying state after destroying (without even using) the vnet interface.
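
For reference, a small sketch of how the dying state can be told apart from a jail that is really gone. The function name jail_state is hypothetical; it assumes that plain `jls -j` only finds active jails while `jls -d -j` also finds dying ones, and that both exit non-zero when nothing matches.

```shell
#!/bin/sh
# Hypothetical helper (not part of the revision): print the state of a jail
# as "active", "dying" or "gone".
jail_state()
{
        if jls -j "$1" >/dev/null 2>&1; then
                # Found without -d: the jail is still active.
                echo active
        elif jls -d -j "$1" >/dev/null 2>&1; then
                # Only found with -d: removed but not yet fully torn down.
                echo dying
        else
                echo gone
        fi
}
```

A jail that keeps reporting "dying" long after `jail -R` would match the behaviour described here.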

olivier edited the test plan for this revision. (Show Details)
olivier added a reviewer: glebius.

Following a new troubleshooting study by zlei.huang@gmail.com on the PR, he identified the culprit commits and proposed a simpler way to reproduce the problem.
The problem isn't with vnet but with the IP stack and jail, so I rewrote the whole test as a simpler version and moved it to the usr.sbin/jail tests.

I've also removed the stray whitespace found in this script.

Since the test is no longer vnet-specific, don't forget to adjust the commit message. Also, does the polling loop still exhibit problems with the non-vnet version of the test? We've got to get rid of that "sleep 5".

usr.sbin/jail/tests/jail_basic_test.sh
73

This looks like a valid public IP address. You should use an RFC 5735 address instead, like 192.0.2.0/24.

jlduran_gmail.com added inline comments.
usr.sbin/jail/tests/jail_basic_test.sh
73

Maybe s/172/127/g would suffice?
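
As an illustration of the address suggestions above, a rough sketch of a check for the IPv4 documentation ranges (192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24). The function name is_doc_ipv4 is hypothetical, and the match is a simple prefix pattern that does not validate the octets.

```shell
#!/bin/sh
# Hypothetical helper (not part of the revision): succeed if the address
# starts with one of the IPv4 prefixes reserved for documentation.
# Note: prefix match only, the remaining octet is not validated.
is_doc_ipv4()
{
        case "$1" in
        192.0.2.*|198.51.100.*|203.0.113.*)
                return 0 ;;
        *)
                return 1 ;;
        esac
}
```

With this check, 172.254.254.254 is rejected (it is outside 172.16.0.0/12 and therefore publicly routable), while an address such as 192.0.2.1 is accepted.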