Update AIO rusage counters asynchronously.
Needs RevisionPublic
Actions

Authored by tmunro on Dec 5 2021, 11:40 AM.

Details

Reviewers

asomers

Group Reviewers

transport

Summary

Currently, aio_return() and aio_waitcomplete() update rusage counters for the caller's thread, which (a) may not be the one that initiated the IO and (b) happens some arbitrary time later. This patch updates the rusage for the submitting thread as soon as the IO completes. Extracted from D33144 for separate review.

Test Plan

A test to exercise the thread exit code path is included.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

tmunro created this revision.Dec 5 2021, 11:40 AM

Herald added a reviewer: transport. · View Herald TranscriptDec 5 2021, 11:40 AM

Herald added subscribers: glebius, melifaro, delphij, imp. · View Herald Transcript

tmunro requested review of this revision.Dec 5 2021, 11:40 AM

tmunro mentioned this in D33144: Reap AIO completions asynchronously when using kqueue..Dec 5 2021, 11:42 AM

This is hard to review without more context. Could you please reupload the revision? If you're using the web UI, generate the diff with diff -U 9999 to ensure full context.

This revision now requires changes to proceed.Dec 5 2021, 2:40 PM

tmunro updated this revision to Diff 99496.Dec 5 2021, 8:12 PM

It would be nice to add a test case that ensures resource accounting happens on the correct thread.

sys/kern/vfs_aio.c
581	This is racy unless you use acquire/release semantics. You should use `atomic_load_acq_int` here and below and everywhere this variable is modified must use `atomic_store_rel_int` or similar.
595	If unsafe AIO is enabled, then this loop could block indefinitely. That's not very good. Is there a way to allow a thread to exit even in this case? What about if it it nulls out the `td` field for every active job of this thread, and then `aio_complete` checks whether that field is null before adjusting its counters? For that matter, AIO's resource usage isn't really associated with any one thread. The I/O itself doesn't happen in a thread's context. So it would be reasonable to account for it not in the submitting thread's `td_ru` structure but in the process's `p_ru` structure.
1081	Is it safe to acquire `job->td`'s thread lock here? If so you could eliminate the atomic ops, and you could also eliminate the wait for all I/O to complete in aio_thread_exit.
sys/sys/resource.h
180	I think it's good style to load these variables using atomic ops if you're going to store them that way, even if the assembly code will be the same. You should define `RU_ATOMIC_LOAD` and use it everywhere these variables are read. And that includes the struct copy in the penultimate line of `rufetchadd`.
tests/sys/aio/aio_test.c
1967	Why are you sure that a zero timeout will be sufficient? I think you should use a NULL timeout instead, to wait forever if necessary.

This revision now requires changes to proceed.Dec 6 2021, 12:10 AM

So I would suggest not accumulating the stats to the submitting thread. Two possibilities:

Lock the PROC_LOCK() of the process and add the AIO resource usage directly into p_ru members.

If you don't want to use the PROC_LOCK, add new members to struct proc to hold AIO resource usage counts and use atomics to update those. However, doing 4 atomics might not really be cheaper than the PROC_LOCK approach. If you do this, you also need to fix places like rufetch() and the code in exit1() to read the AIO counts. For rufetch() simple reads would be ok as the race is harmless. For exit1(), AIO will all be stopped so the values will be stable without needing a lock.