Page MenuHomeFreeBSD

resolv_test: Fix racy exit check, remove mutexes, and reduce output
ClosedPublic

Authored by arichardson on Mar 23 2021, 11:21 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Dec 15, 2:23 AM
Unknown Object (File)
Fri, Dec 6, 6:47 AM
Unknown Object (File)
Wed, Dec 4, 4:22 PM
Unknown Object (File)
Wed, Dec 4, 4:22 PM
Unknown Object (File)
Wed, Dec 4, 4:22 PM
Unknown Object (File)
Wed, Dec 4, 4:21 PM
Unknown Object (File)
Wed, Dec 4, 3:59 PM
Unknown Object (File)
Mon, Dec 2, 10:32 AM
Subscribers

Details

Summary

Instead of polling nleft[i] (without appropriate memory barriers!) and
using sleep() to detect the exit just call pthread_join() on all threads.

Also replace the use of a mutex that guarding the increments with atomic
fetch_add. This should reduce the runtime of this test on SMP systems.

Finally, remove all the debug printfs unless DEBUG_OUTPUT is set in
the environment.

Test Plan

still fails sometimes on qemu (but maybe less often?)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

lib/libc/tests/resolv/resolv_test.c
203

I think if you cared about the error you could probably write this simpler as:

atomic_add_explicit(&ask[i], 1, memory_order_relaxed);
if (error == 0)
    atomic_add_explicit(&got[i], 1, memory_order_relaxed);
else if (got[i] != 0)
    fprintf(stderr, ...);

Not sure why you needed the fetchadd for ask[i], and I think the race you are worried about here doesn't matter in terms of reading got[i]?

I think the real problem might be QEMU's builtin DNS being flaky (or something wrong in the kernel). However, the nleft[i] race was pointed out by TSan.

lib/libc/tests/resolv/resolv_test.c
203

I guess could do that instead, but since fetchadd returns the old value, there's no need to load it again?

Multiple threads are reading+writing to both ask[] and got[]. I think using _Atomic(int) * will ensure that the ++ is a fetchadd, but might as well be explicit.
If we don't use atomics we could get lost updates.

lib/libc/tests/resolv/resolv_test.c
203

In the code above, you don't load got[i] twice as you either increment it for error == 0, or you check it against 0 for error != 0. I find the current code a bit confusing in that case (it intentionally adds 0 sometimes). It was a cute trick to avoid a branch perhaps under the mutex (but not really), but since you have to check error explicitly for the new stderr message, you might as well avoid the cute "add 0" trick.

I understand you need atomics, but using 'fetchadd' instead of plain 'add' for ask[i] when you don't use the returned value is what I found confusing.

lib/libc/tests/resolv/resolv_test.c
203

Ah sorry I misunderstood. I'll update it to avoid the add 0

arichardson marked 2 inline comments as done.

Apply suggestion

This revision is now accepted and ready to land.Mar 25 2021, 8:49 PM