Page MenuHomeFreeBSD

resolv_test: Fix racy exit check, remove mutexes, and reduce output
ClosedPublic

Authored by arichardson on Tue, Mar 23, 11:21 AM.

Details

Summary

Instead of polling nleft[i] (without appropriate memory barriers!) and
using sleep() to detect the exit just call pthread_join() on all threads.

Also replace the use of a mutex that guarding the increments with atomic
fetch_add. This should reduce the runtime of this test on SMP systems.

Finally, remove all the debug printfs unless DEBUG_OUTPUT is set in
the environment.

Test Plan

still fails sometimes on qemu (but maybe less often?)

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

lib/libc/tests/resolv/resolv_test.c
202

I think if you cared about the error you could probably write this simpler as:

atomic_add_explicit(&ask[i], 1, memory_order_relaxed);
if (error == 0)
    atomic_add_explicit(&got[i], 1, memory_order_relaxed);
else if (got[i] != 0)
    fprintf(stderr, ...);

Not sure why you needed the fetchadd for ask[i], and I think the race you are worried about here doesn't matter in terms of reading got[i]?

I think the real problem might be QEMU's builtin DNS being flaky (or something wrong in the kernel). However, the nleft[i] race was pointed out by TSan.

lib/libc/tests/resolv/resolv_test.c
202

I guess could do that instead, but since fetchadd returns the old value, there's no need to load it again?

Multiple threads are reading+writing to both ask[] and got[]. I think using _Atomic(int) * will ensure that the ++ is a fetchadd, but might as well be explicit.
If we don't use atomics we could get lost updates.

lib/libc/tests/resolv/resolv_test.c
202

In the code above, you don't load got[i] twice as you either increment it for error == 0, or you check it against 0 for error != 0. I find the current code a bit confusing in that case (it intentionally adds 0 sometimes). It was a cute trick to avoid a branch perhaps under the mutex (but not really), but since you have to check error explicitly for the new stderr message, you might as well avoid the cute "add 0" trick.

I understand you need atomics, but using 'fetchadd' instead of plain 'add' for ask[i] when you don't use the returned value is what I found confusing.

lib/libc/tests/resolv/resolv_test.c
202

Ah sorry I misunderstood. I'll update it to avoid the add 0

arichardson marked 2 inline comments as done.

Apply suggestion

This revision is now accepted and ready to land.Thu, Mar 25, 8:49 PM