Page MenuHomeFreeBSD

Fix race in taskqueue_drain_timeout()
ClosedPublic

Authored by hselasky on Sep 23 2016, 1:07 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 18, 2:44 AM
Unknown Object (File)
Fri, Jan 17, 8:03 PM
Unknown Object (File)
Fri, Jan 17, 2:03 PM
Unknown Object (File)
Fri, Jan 17, 1:19 AM
Unknown Object (File)
Dec 9 2024, 3:56 AM
Unknown Object (File)
Dec 9 2024, 3:54 AM
Unknown Object (File)
Nov 26 2024, 3:25 AM
Unknown Object (File)
Nov 25 2024, 4:14 PM
Subscribers

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

hselasky retitled this revision from to Fix race in taskqueue_drain_timeout().
hselasky updated this object.
hselasky edited the test plan for this revision. (Show Details)
hselasky added a reviewer: kib.
hselasky set the repository for this revision to rS FreeBSD src repository - subversion.
sys/kern/subr_taskqueue.c
84 ↗(On Diff #20640)

1<<1

304 ↗(On Diff #20640)

Silently ignoring enqueue request and returning success is not reasonable KPI. IMO, an error must be returned there. Callers might require some adoption to the error.

bjk added inline comments.
share/man/man9/taskqueue.9
227 ↗(On Diff #20696)

s/Else/Otherwise,'

hselasky edited edge metadata.

Address manpage comments.

I think that this variant of the subr_taskqueue_enqueue_timeout is fine in isolation, i.e. if we ignore the callers.

Did you inspected all callers of the function in the head ? In fact, r306320 is what I mean when we discussed the issue elsewere, and I thnk that despite this change, all callers should be fixed to do the similar treatment to the 'stopped' flag as the mentioned revision.

@kib
Did you inspected all callers of the function in the head ?

There is currently only one place in the kernel where the return value of this function is checked and that is:

sys/fs/autofs/autofs.c:		error = taskqueue_enqueue_timeout(taskqueue_thread,
sys/fs/autofs/autofs.c-		    &ar->ar_task, autofs_timeout * hz);
sys/fs/autofs/autofs.c-		if (error != 0) {
sys/fs/autofs/autofs.c:			AUTOFS_WARN("taskqueue_enqueue_timeout() failed "
sys/fs/autofs/autofs.c-			    "with error %d", error);
sys/fs/autofs/autofs.c-		}

From what I can see, this patch will not cause a problem, because the place where the task is residing is a new allocated memory block, so taskqueue_enqueue_timeout() will always return zero.

FYI: r306320 doesn't fully fix the problem. We also need the reliable taskqueue drain.

kib added a reviewer: trasz.
kib edited edge metadata.

Ok.

trasz, note that return value from the taskqueue_enqueue_timeout is not an error now, and not quite an error after the patch in this revision is applied.

This revision is now accepted and ready to land.Sep 26 2016, 8:14 AM

@kib
Do you think the setting of the DT_DRAIN_IN_PROGRESS flag should be in its own function?

--HPS

Do you think the setting of the DT_DRAIN_IN_PROGRESS flag should be in its own function?

Do you mean, should you create two helper inline functions to set and clear the flag ? I do not think it is important since right now the functions are single-use.

In D8012#166290, @kib wrote:

Do you think the setting of the DT_DRAIN_IN_PROGRESS flag should be in its own function?

Do you mean, should you create two helper inline functions to set and clear the flag ? I do not think it is important since right now the functions are single-use.

Yes, I was just wondering if that would be useful in the case where you want to do flush instead of drain, like inserting a barrier. Right now there is no need for such a function.

I'll wait for @trasz to respond and then I'll push.

Thank you!

trasz edited edge metadata.

I don't think it'll break anything in autofs.

hselasky edited edge metadata.

Fix inverted KASSERT() check.

This revision now requires review to proceed.Sep 29 2016, 10:03 AM
This revision was automatically updated to reflect the committed changes.
wblock added inline comments.
head/share/man/man9/taskqueue.9
226

Passive -> active:

This function returns -1 if the queue is being drained.
Otherwise, the number of pending calls is returned.