Page MenuHomeFreeBSD

ufshci: revisit controller reset path and add I/O timeout handling
ClosedPublic

Authored by jaeyoon on Sep 9 2025, 3:58 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Oct 11, 7:19 AM
Unknown Object (File)
Sat, Oct 11, 12:02 AM
Unknown Object (File)
Sat, Oct 11, 12:01 AM
Unknown Object (File)
Sat, Oct 11, 12:01 AM
Unknown Object (File)
Sat, Oct 11, 12:01 AM
Unknown Object (File)
Sat, Oct 11, 12:01 AM
Unknown Object (File)
Sat, Oct 11, 12:01 AM
Unknown Object (File)
Fri, Oct 10, 5:36 PM
Subscribers

Details

Summary

This patch revisits the controller reset path and introduces timeout
handling for I/O commands.

To support controller reset during driver operation, the controller’s
construct, destruct, enable, and disable functions are clearly
separated in ufshci_ctrlr.c. ufshci_ctrlr_hw_reset() function is
added to leverage enable/disable.

After initialization, ufshci_ctrlr_reset_task() is also introduced to
ensure controller resets are performed via the task queue.

Timeout handling is designed in five steps. This patch implements
Step 1 and Step 5, while the remaining steps will be added later.
The timeout mechanism follows the same shared timeout model used in
the NVMe driver.

Test: Intentionally delayed UPIU I/O in QEMU to trigger a timeout and
verify timeout handling.

Sponsored by: Samsung Electronics

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Added some comments for reviewers.

sys/dev/ufshci/ufshci_ctrlr.c
16

This function was moved here from line 299.

29

This function was moved here from line 312 and resetting was added.

203

This function is an extracted part of the ufshci_ctrlr_construct() function. (lines 120, 137)

I expected the timeout handling to be in the CAM, but it seems the CAM only performs a retry and does not execute cmd abort and controller reset. Therefore, I implemented the cmd abort and controller reset within the ufshci driver. Is my understanding correct?

Fix queue contruct sequence

So this change is a bit larger than I like to review. But the moved code was simple (and I know I have comment about the moved code, and that maybe the NVMe driver isn't good about doing it true). In the future, though it would make it easiest if you had the 'move it' commits separate from the rest. The place I want a comment could get one later too, if that's easier to do.

Looking at the recovery algorithm, it does seem quite similar to the NVMe one. The only concern I have here is that we'll need to track it in two places. I've been tweaking it from time to time in the nvme driver, but I don't know how applicable future tweaks will be. I guess this is more of a "be aware" in the future comment.

sys/dev/ufshci/ufshci_ctrlr.c
33

so for these special cases, it might be good to document what's special about resetting

This revision is now accepted and ready to land.Sep 11 2025, 5:49 PM
This revision now requires review to proceed.Sep 12 2025, 1:13 AM
In D52440#1198791, @imp wrote:

So this change is a bit larger than I like to review. But the moved code was simple (and I know I have comment about the moved code, and that maybe the NVMe driver isn't good about doing it true). In the future, though it would make it easiest if you had the 'move it' commits separate from the rest. The place I want a comment could get one later too, if that's easier to do.

Looking at the recovery algorithm, it does seem quite similar to the NVMe one. The only concern I have here is that we'll need to track it in two places. I've been tweaking it from time to time in the nvme driver, but I don't know how applicable future tweaks will be. I guess this is more of a "be aware" in the future comment.

Thank you for the review! I think I rushed this a bit. Next time, I will separate code moves into a separate commit.
I implemented the recovery algorithm the same as NVMe. I’m also concerned about the duplicated logic. Would it be possible to move the recovery logic into CAM in the future?
For now, I will keep an eye on NVMe changes.

This looks better now. Doing things better next time is fine.

This revision is now accepted and ready to land.Thu, Sep 18, 12:59 AM