
Hold an explicit reference on the socket for the aiotx task.
ClosedPublic

Authored by jhb on Jun 6 2019, 6:47 PM.
Tags
None
Referenced Files
F108520024: D20539.id59113.diff
Sat, Jan 25, 8:35 PM
Details

Summary

Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket. However, once the last job completes,
nothing holds a reference on the socket any longer, yet the task still
takes the socket buffer lock to check whether the queue is empty. In
addition, if the last job on the queue is cancelled, the task can run
with no queued jobs holding a reference, so the socket buffer lock the
task uses to notice the empty queue may already have been freed.

Fix these races by holding an explicit reference on the socket when the
task is queued and dropping that reference when the task completes.
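
The pattern described above can be illustrated with a small userspace sketch. This is not the code from the diff: `struct fake_socket`, `sock_ref()`, `sock_rele()`, `task_schedule()`, `job_queue()`, and `task_run()` are hypothetical names invented for illustration, loosely modelled on the kernel's `soref()`/`sorele()` reference counting, and "freeing" is simulated with a flag so the state can be inspected afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted socket (not struct socket). */
struct fake_socket {
	int  refs;         /* outstanding references */
	int  queued_jobs;  /* aio jobs waiting for the task */
	bool task_queued;  /* true while the task is pending */
	bool freed;        /* simulated free: set when refs reaches 0 */
};

void
sock_ref(struct fake_socket *so)
{
	/* Taking a reference on an already freed object is a bug. */
	assert(!so->freed);
	so->refs++;
}

void
sock_rele(struct fake_socket *so)
{
	if (--so->refs == 0)
		so->freed = true;
}

/* Queue the task; the task takes its own reference while it is pending. */
void
task_schedule(struct fake_socket *so)
{
	if (!so->task_queued) {
		sock_ref(so);
		so->task_queued = true;
	}
}

/* Queue one job; each job also holds a reference, as before the fix. */
void
job_queue(struct fake_socket *so)
{
	sock_ref(so);
	so->queued_jobs++;
	task_schedule(so);
}

/*
 * Run the task: complete every job, then inspect the queue one last time.
 * Returns true if the socket was still alive at that final check.
 */
bool
task_run(struct fake_socket *so)
{
	bool alive;

	while (so->queued_jobs > 0) {
		so->queued_jobs--;
		sock_rele(so);          /* drop the job's reference */
	}
	alive = !so->freed;             /* the "is the queue empty?" check */
	so->task_queued = false;
	sock_rele(so);                  /* drop the task's own reference */
	return (alive);
}
```

In this model, a socket that starts with one owner reference, has one job queued, and then loses its owner reference is still alive when `task_run()` performs its final check, because the task's own reference keeps the count above zero. If `task_schedule()` did not call `sock_ref()`, the count would reach zero inside the loop, which corresponds to the race the summary describes.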

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

So while writing the log message, I decided I still don't really like this version. Namely, even after dequeueing the last job, we will then try to lock the socket buffer to check the queue. If we completed the last job, we might have dropped the last reference to the socket and freed it. So I think the real fix is that we have to grab a reference when queueing the task. I don't know whether a reference on the socket is enough to keep inp->inp_socket itself stable, though.

Make the aiotx task hold an explicit reference on the socket.

jhb retitled this revision from "Ensure any pending task has stopped when cancelling the last aiotx job." to "Hold an explicit reference on the socket for the aiotx task." (Jun 6 2019, 7:11 PM)
jhb edited the summary of this revision.

I ran the same aio tx test that used to panic the system and verified that this change fixes the problem.

This revision is now accepted and ready to land. (Jun 27 2019, 7:16 PM)