syslogd/tests: Address races
Closed · Public

Authored by markj on Jan 19 2026, 3:41 PM.

Details

Reviewers

jfree
jlduran

Commits

rGb4036ae6cc77: syslogd/tests: Address races
rG560c22937ba9: syslogd/tests: Address races

Summary

I occasionally see failures in the syslogd test suite. The problem is
that the tests are racy: they often 1) send a message using logger(1),
then 2) immediately check whether the message was logged to a log file.
If the syslogd instance under test doesn't get a chance to run before
step 2 the test fails.

This change reworks things to avoid the race while minimizing the amount
of time sleeping.

- Each test uses a single logfile, so have them use a new common
  variable, SYSLOGD_LOGFILE, instead of something test-specific.
- In syslogd_start(), if the configuration references SYSLOGD_LOGFILE,
  wait for it to be created by syslogd before returning. Record its
  modification time.
- Add a helper, syslogd_check_log(), to check for a given log entry in
  the last line of SYSLOGD_LOGFILE, instead of using atf_check directly.
- In syslogd_check_log(), wait for the modification time of the logfile
  to change relative to syslogd_start() or a previous
  syslogd_check_log() call, before checking for the desired log entry.
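
As a rough illustration of the modification-time wait described in the last item above, here is a minimal sketch in plain sh. It is not the code from this revision: `file_mtime` and `wait_for_mtime_change` are hypothetical names, and the `stat` fallback assumes either BSD or GNU stat(1). Because it compares whole-second timestamps, a write that lands in the same second as the recorded mtime would go unnoticed.

```sh
#!/bin/sh

# Print a file's modification time in seconds since the epoch.
# Tries the BSD stat(1) syntax first, then falls back to the GNU syntax.
file_mtime() {
    if m=$(stat -f %m "$1" 2>/dev/null); then
        echo "$m"
    else
        stat -c %Y "$1"
    fi
}

# Hypothetical helper: wait until FILE's mtime differs from OLD, giving up
# after roughly TIMEOUT seconds. A file that cannot be examined is treated
# as unchanged.
wait_for_mtime_change() {
    file=$1
    old=$2
    timeout=$3
    elapsed=0

    while :; do
        now=$(file_mtime "$file" 2>/dev/null) || now=$old
        if [ "$now" != "$old" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Demo: a background writer stands in for syslogd appending to the logfile.
log=$(mktemp "${TMPDIR:-/tmp}/mtime_demo.XXXXXX")
before=$(file_mtime "$log")
( sleep 2; echo "entry" >> "$log" ) &
if wait_for_mtime_change "$log" "$before" 5; then
    tail -n 1 "$log"
fi
wait
rm -f "$log"
```

The caller records the mtime once and passes it back in, which mirrors the summary's idea of comparing against the value saved at startup or by the previous check.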

With this change, I was able to run the tests 1000 times in a loop with
4-way parallelism without seeing any test failures. Without the change
I usually get a failure within 10 loops.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 69985
Build 66868: arc lint + arc unit

Event Timeline

markj created this revision. Jan 19 2026, 3:41 PM

Herald added a subscriber: imp. Jan 19 2026, 3:41 PM

markj requested review of this revision. Jan 19 2026, 3:41 PM

markj added a parent revision: D54778: syslogd/tests: Use a helper function to log from within a jail.

Harbormaster completed remote builds in B69985: Diff 170044. Jan 19 2026, 3:41 PM

Nice! I may steal a few ideas for some other similar tests.
I made a small observation that can be ignored; I'm just thrilled with my recent discovery of `atf_check -r`.

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	I wonder if almost the entirety of this function can be replaced with: `atf_check -r 50 test $(cat MTIME) != ${mtime}`

This revision is now accepted and ready to land. Jan 19 2026, 8:27 PM

markj added inline comments. Jan 19 2026, 10:44 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	Oh huh. I think I can use that to get rid of MTIME entirely: when we're looking for logfile entries, just do something like `atf_check -r 5 -o match:"${msg}" tail -n 1 "${SYSLOGD_LOGFILE}"`. I'll give that a try.
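
The retry-until-match idea proposed here can be imitated in plain sh. The sketch below is a stand-in written under assumptions, not atf_check's implementation: `poll_last_line` is a hypothetical helper that re-reads the last line of a file about once per second until a pattern matches or a timeout in whole seconds expires.

```sh
#!/bin/sh

# Hypothetical helper: succeed as soon as the last line of FILE matches
# PATTERN, or fail once roughly TIMEOUT seconds have passed.
poll_last_line() {
    pattern=$1
    file=$2
    timeout=$3
    elapsed=0

    while :; do
        # If the file does not exist yet, tail prints nothing and grep fails.
        if tail -n 1 "$file" 2>/dev/null | grep -q -- "$pattern"; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Demo: a background writer stands in for syslogd handling a logger(1) message.
log=$(mktemp "${TMPDIR:-/tmp}/poll_demo.XXXXXX")
( sleep 1; echo "hello from logger" >> "$log" ) &
if poll_last_line "hello from logger" "$log" 5; then
    echo "found"
fi
wait
rm -f "$log"
```

Polling for the expected content directly removes the need to track a separate modification time, at the cost of waiting out the full timeout whenever the entry never appears.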

markj added inline comments. Jan 19 2026, 11:57 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	This almost works, but there's a small issue: if the command times out, as it will in some syslogd tests which are marked expected-to-fail, the test result is "broken" instead of "expected failure". I think I'll just modify those test cases to use syslogd_check_log_nopoll() instead, as the -r trick makes this patch much simpler.

Follow jlduran's suggestion of using atf_check -r.

This revision now requires review to proceed. Jan 19 2026, 11:58 PM

Harbormaster completed remote builds in B70000: Diff 170084. Jan 19 2026, 11:58 PM

Very nice!

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	I think that is a bug in atf-check. Will investigate.

This revision is now accepted and ready to land. Jan 20 2026, 1:37 AM

markj added inline comments. Jan 20 2026, 1:51 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	Yeah, I tend to agree. I think if the global test timeout elapses, then the test should be marked broken regardless of whether it's expected to fail, but individual atf_check timeouts should be handled the same as if atf_check itself had failed.

markj added a child revision: D54799: syslogd/tests: Improve lo0 initialization. Jan 20 2026, 5:10 PM

jlduran added inline comments. Jan 21 2026, 3:10 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	More info on the "bug": if you use any timeout less than 5, it works as expected: `atf_check -r 4 -o match:"${msg}" tail -n 1 "${SYSLOGD_LOGFILE}"`

markj added inline comments. Jan 21 2026, 3:31 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	Ahh, good find, we have this `set_common_atf_metadata()` function which sets the timeout. syslogd/tests/Makefile already sets a timeout of 20s for each test, which I think is fine... what do you think of simply removing this 5s timeout?

jlduran added inline comments. Jan 21 2026, 7:18 PM

usr.sbin/syslogd/tests/syslogd_test_common.sh
200	Ah, yes, I agree. Also, you should be able to remove `syslogd_check_log_nopoll()`, as `syslogd_check_log()` will fail as expected after 10 seconds (rather than being reported as broken).