nvme: Replace potentially long DELAY()'s with pause()'s
ClosedPublic

Authored by mav on Mar 17 2021, 2:45 AM.

Details

Summary

In some cases nvme(4) may wait minutes for a controller response before timing out, for example when the hardware is broken. Doing so in a tight spin loop makes the whole system unresponsive.
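For illustration only, here is a user-space sketch of the pattern the summary describes: poll for readiness against a tick deadline and sleep one tick between polls instead of spinning. It is not the committed diff. `sim_ticks`, `sim_ready_at`, `sim_pause()`, `sim_ctrlr_ready()` and `wait_for_ready()` are hypothetical stand-ins; in the kernel the sleep would be pause(9) and the spin would be DELAY(9).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for kernel state; not driver code. */
static int sim_ticks;		/* models the kernel tick counter */
static int sim_ready_at;	/* tick at which the fake controller is ready */

/* Models pause(9): the caller sleeps for timo ticks, yielding the CPU. */
static void
sim_pause(int timo)
{
	sim_ticks += timo;
}

/* Models reading the controller's ready bit. */
static bool
sim_ctrlr_ready(void)
{
	return (sim_ticks >= sim_ready_at);
}

/*
 * Wait for the fake controller to become ready, sleeping one tick
 * between polls. Returns 0 on success and -1 once timeout_ms has elapsed.
 */
static int
wait_for_ready(int timeout_ms, int hz)
{
	int deadline;

	assert(timeout_ms >= 0 && hz > 0);
	deadline = sim_ticks + (int)((int64_t)timeout_ms * hz / 1000);

	while (!sim_ctrlr_ready()) {
		if (deadline - sim_ticks < 0)
			return (-1);	/* timed out */
		sim_pause(1);		/* sleep instead of busy-waiting */
	}
	return (0);
}
```

The point of the sketch is only that the waiting thread gives up the CPU on every iteration, so a multi-minute timeout no longer monopolizes a core.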

Test Plan

Without the patch, hot-plugging a broken NVMe SSD leaves the system console stuck for two minutes. With the patch, the system remains responsive and just prints "controller ready did not become 1 within 120500 ms" two minutes later.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mav requested review of this revision. Mar 17 2021, 2:45 AM

Seems reasonable... we used to run this before interrupts were enabled, and now we don't, so the DELAYs are a bit OBE (overtaken by events) now anyway.

sys/dev/nvme/nvme_ctrlr.c
273

Do we need to worry about this being 0 when hz < 1000?

1557

Ditto...

sys/dev/nvme/nvme_ctrlr.c
273

No, not for the last 10 years, since 9e3ae31c7a935255092fff24544c3d120bbd15bc: "Given that the typical usage of pause() is pause("zzz", hz / N), where N can be greater than hz in some cases, simply ignore a timeout value of zero." A zero there is handled as one.
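A small user-space model of the arithmetic in question, as a sketch: integer division truncates a millisecond interval to zero ticks when hz < 1000, and the zero is then treated as one tick, as described above. `ms_to_ticks()` and `effective_pause_ticks()` are hypothetical helper names used only for this illustration; they are not kernel functions.

```c
#include <assert.h>

/*
 * Convert milliseconds to ticks with integer division, as a
 * pause("zzz", hz / N) style caller would. The result truncates
 * toward zero, so 1 ms at hz = 100 becomes 0 ticks.
 */
static int
ms_to_ticks(int ms, int hz)
{
	assert(ms >= 0 && hz > 0);
	return ((int)((long long)ms * hz / 1000));
}

/*
 * Model of the zero handling described in the comment above:
 * a timeout of zero ticks is handled as one tick.
 */
static int
effective_pause_ticks(int timo)
{
	return (timo == 0 ? 1 : timo);
}
```

With hz = 100, `ms_to_ticks(1, 100)` is 0 and `effective_pause_ticks(0)` is 1, so under this model the caller sleeps for one tick rather than not sleeping at all.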

This revision was not accepted when it landed; it landed in state Needs Review. Mar 17 2021, 2:36 PM
This revision was automatically updated to reflect the committed changes.