Differential D54695

pf: tests: Introduce wait_for_process()
Abandoned · Public

Authored by jlduran on Jan 13 2026, 10:12 PM.

Details

Reviewers

Group Reviewers

tests

Summary

Introduce a new function that waits for a process to start, with a
configurable timeout and polling interval. By default, it polls at
1-second intervals for up to 30 seconds.

It tries to address sporadic failures in test pflog:rdr_action which may
be caused by an insufficient sleep time during tcpdump startup. The new
function provides reliable process detection with configurable
parameters for different timing requirements.

The implementation uses pgrep to detect running processes and includes
proper timeout handling to prevent indefinite waiting, as well as an
optional jail parameter. There are many "sleep 1" calls in the test
suite that can be replaced with this helper function.
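
The diff body is not reproduced on this page, so the following is only a minimal sketch of the kind of helper the summary describes, not the code from the revision. The function name, argument order, and use of return (rather than exit) are assumptions made for illustration. The interval must be a whole number of seconds because sh arithmetic is integer-only.

```shell
# Hypothetical sketch, not the helper from D54695.
# usage: wait_for_process_sketch name [timeout] [interval] [jail]
wait_for_process_sketch()
{
	_name=$1
	_timeout=${2:-30}	# give up after this many seconds
	_interval=${3:-1}	# whole seconds between checks
	_jail=${4:-}		# optional jail name or JID
	_elapsed=0

	while [ "$_elapsed" -lt "$_timeout" ]; do
		if [ -n "$_jail" ]; then
			# -j restricts matches to a jail (FreeBSD pgrep(1)).
			pgrep -j "$_jail" -x "$_name" >/dev/null 2>&1 && return 0
		else
			pgrep -x "$_name" >/dev/null 2>&1 && return 0
		fi
		sleep "$_interval"
		_elapsed=$((_elapsed + _interval))
	done

	# Timed out without seeing the process.
	return 1
}
```

A test could then call something like "wait_for_process_sketch tcpdump 10 1" in place of a fixed "sleep 1". This only confirms that the process exists, not that it has finished initializing.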

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 69866
Build 66749: arc lint + arc unit

Event Timeline

jlduran created this revision. Jan 13 2026, 10:12 PM

Herald added subscribers: vegeta_tuxpowered.net, glebius, melifaro and 2 others. Jan 13 2026, 10:12 PM

jlduran requested review of this revision. Jan 13 2026, 10:12 PM

Simplify jail processing
Exit rather than return

jlduran added reviewers: tests, kp. Jan 13 2026, 11:01 PM

This might be overkill, but I would like to know whether a similar helper function could help avoid intermittent failures (flakiness) in some tests. I ran this test natively on a very fast aarch64 machine, and it did not fail once. Comparing the output of a successful run with that of a failed one makes me wonder whether the test is simply not sleeping long enough:

For example:
Reference GOOD: https://ci.freebsd.org/view/Test/job/FreeBSD-main-amd64-test/27610/testReport/junit/sys.netpfil.pf/pflog/rdr_action/ (galahad2.nyi.freebsd.org)
Reference BAD: https://ci.freebsd.org/view/Test/job/FreeBSD-main-amd64-test/27611/testReport/junit/sys.netpfil.pf/pflog/rdr_action/ (galahad2.nyi.freebsd.org)

Harbormaster completed remote builds in B69863: Diff 169578. Jan 14 2026, 12:50 AM

Harbormaster completed remote builds in B69866: Diff 169581.

I'm not sure this is sufficient. It is still possible for tcpdump to have started, but not gotten to the point of actually opening the pflog device.

I've had a very quick look at the tcpdump code, and I think it looks like we could potentially rely on the existence of a capture file (so -w <file>). I think that file only gets created after we've done pcap_open()/pcap_setfilter(). That'd require a bit of test-reworking though, because we'd be saving a pcap file, which we'd have to translate back to text to run the checks on.
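
As a rough illustration of this suggestion, the sketch below polls for the capture file instead of the process. The helper name, the pflog0 interface and the file names are assumptions, and whether the file really appears only after the filter is installed is the unverified guess stated above.

```shell
# Hypothetical helper: poll once per second until a file exists or the
# timeout (in whole seconds) expires.
wait_for_file_sketch()
{
	_file=$1
	_timeout=${2:-30}
	_elapsed=0

	while [ ! -e "$_file" ]; do
		[ "$_elapsed" -ge "$_timeout" ] && return 1
		sleep 1
		_elapsed=$((_elapsed + 1))
	done
	return 0
}

# Possible use in a test (interface and file names are assumptions):
#   tcpdump -U -i pflog0 -w pflog.pcap &
#   wait_for_file_sketch pflog.pcap 10 || atf_fail "no capture file"
#   ... generate traffic, then stop tcpdump ...
#   tcpdump -n -e -r pflog.pcap > pflog.txt   # translate back to text
```

The -U flag asks tcpdump to write each packet to the file as it is saved, and -r reads the saved capture back so the existing text checks can still run on the output.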

tests/sys/netpfil/pf/pflog.sh
384	I'd keep this in. It can be useful when debugging test failures.

In D54695#1249657, @kp wrote:

> I'm not sure this is sufficient. It is still possible for tcpdump to have started, but not gotten to the point of actually opening the pflog device.
>
> I've had a very quick look at the tcpdump code, and I think it looks like we could potentially rely on the existence of a capture file (so -w <file>). I think that file only gets created after we've done pcap_open()/pcap_setfilter(). That'd require a bit of test-reworking though, because we'd be saving a pcap file, which we'd have to translate back to text to run the checks on.

Ah, yes! I'll experiment with -w <file> together with atf_check -r instead.

I'll just submit the trivial rdr_action_head() fix and abandon the rest.
I'll wait for the dust from the test timeouts to settle, and then simply try sleeping a bit longer once I have gathered all the flaky tests.
Thank you for your input!

Revision Contents
Changeset List

Path                             Size
tests/sys/common/vnet.subr       93 lines
tests/sys/netpfil/pf/pflog.sh    10 lines

Diff 169581

tests/sys/common/vnet.subr