Page MenuHomeFreeBSD

tests/kern/ssl_sendfile: fix 'random' and 'basic' flakyness
ClosedPublic

Authored by glebius on Mon, Mar 9, 10:16 PM.

Details

Summary

The basic and random test cases checked c->sbytes immediately after draining all data from SSL_read, without waiting for the server thread to return from SSL_sendfile().
On a loopback connection, the client can consume all data from the kernel receive buffer before SSL_sendfile() returns on the server side, leaving c->sbytes unset at the point of the ATF_REQUIRE check.

Add wait_server() which holds the mutex and waits on the condvar until the server thread transitions out of RUNNING state, ensuring c->sbytes is stable before it is read.

This fixes a flaky failure appearing roughly once every 40 runs with error:

tests/sys/kern/ssl_sendfile.c:327: c.sbytes == (ssize_t)expect not met
Test Plan

Initialy I’ve increased the failure rate with a small patch by adding usleep(1) temporarily in the test:

ATF_REQUIRE(nread == expect);
usleep(1);  /* widen the race window */
ATF_REQUIRE(c.sbytes == (ssize_t)expect);

With this patch applied, I’ve ran stress-ng to load all my core stress-ng --cpu $(nproc) while running the test in loop thousand of time while sudo cpuset -l 0 kyua test sys/kern/ssl_sendfile:random; do :; done

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

olivier created this revision.
olivier edited the test plan for this revision. (Show Details)

Thanks a lot for the diagnosis and fix. Note that truncate, grow and offset_beyond_eof tests already have a very similar race protection. What about making the epilogue of those tests common? I'd suggest to combine the wait and the sbytes check into a single function.

glebius edited reviewers, added: olivier; removed: glebius.
glebius retitled this revision from tests/ssl_sendfile: fix race between client read and c->sbytes check to tests/kern/ssl_sendfile: fix 'random' and 'basic' flakyness.Tue, Mar 10, 5:47 PM
glebius edited the test plan for this revision. (Show Details)

This isn't actually a true race, but an unsynchronized read. No need to
sleep, just flush the caches.

This revision was not accepted when it landed; it landed in state Needs Review.Tue, Mar 10, 9:04 PM
This revision was automatically updated to reflect the committed changes.