
Implement nvme suspend / resume for pci attachment
ClosedPublic

Authored by imp on Sep 3 2019, 3:39 AM.

Details

Summary

When we suspend, we need to properly shut down the NVMe controller. The
controller may go into the D3 state (or may have its power removed), and
to properly flush the metadata to non-volatile RAM, we must complete a
normal shutdown. This consists of deleting the I/O queues and setting
the shutdown bit. We also have to do some extra work to make sure the
software state of the queues is reset as well.

On resume, we have to reset the card twice, for reasons described in
the attach function. Once we've done that, we can restart the card. If
any of this fails, we'll fail the NVMe card, just like we do when a
reset fails.

Pass whether or not we're resetting to nvme_ctrlr_start. We use that on
the resume path, even though we're not really resetting (we set
is_resetting out of paranoia: if something else tries to initiate a
reset during resume, we block it). However, other than the taskqueue
interlock, we only use is_resetting inside nvme_ctrlr_start, so it's
best to just pass it in since we know why we're calling it.

Export a number of nvme_ctrlr functions. Also, rename
nvme_ctrlr_destroy_qpairs to nvme_ctrlr_delete_qpairs to avoid
confusion with all the other destroy functions:
nvme_ctrlr_delete_qpairs just removes the queues in hardware, while
the other _destroy_ functions tear down data structures.

Finally, split up parts of the hardware reset function so that I can
do part of the reset in suspend. Split out the software disabling
of the qpairs into nvme_ctrlr_disable_qpairs.

Test Plan

I've suspended and resumed a dozen times on one machine... time for review :)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

imp added reviewers: jimharris, mav, scottl.
sys/dev/nvme/nvme_pci.c
56 ↗(On Diff #61572)

I'm unclear if this will work with the ahci attachment, so I opted for conservative first roll-out. However, I don't see anything PCI specific in where I wound up... But I'll need a report from someone who is running with ahci attachment to confirm.

358 ↗(On Diff #61572)

It's unclear if I really need to do this. is_resetting = 1 is likely enough, though if someone suspends at the exact moment that someone resets, there will be a race that I'm unsure how best to cope with.

sys/dev/nvme/nvme_pci.c
56 ↗(On Diff #61572)

Also, fewer things would need to be exported out of nvme_ctrlr if there were a nvme_ctrlr_suspend / nvme_ctrlr_resume instead, though we'd still need to do the other refactoring I did...

Move things to export less from nvme_ctrlr

scottl added inline comments.
sys/dev/nvme/nvme_pci.c
358 ↗(On Diff #61572)

If this becomes a problem, then my suggestion is to create a work loop thread to serialize execution for all codepaths that need to set or check this flag.

This revision is now accepted and ready to land.Sep 3 2019, 5:28 AM

Redid with the code moved into nvme_ctrlr... tests fine... I like that better (it will make ahci easier to do too)...

sys/dev/nvme/nvme_pci.c
358 ↗(On Diff #61572)

I think that waiting on is_resetting may suffice. Right now it's used to avoid concurrently submitting a reset to the taskqueue at the higher levels. I think waiting for any pending reset to complete before we suspend should suffice, if there's really an issue here... I suspect ~nobody will ever see it absent an aggressive shell script that loops doing zzz and nvmecontrol reset...

Improve comments. Wait for reset to finish. Stay in the 'resetting'
state until resume completes (though this seems to only affect
queueing of new commands on I/O completions, which shouldn't be
possible without the hardware queues present...).

This revision now requires review to proceed.Sep 3 2019, 2:44 PM
imp marked an inline comment as done.Sep 3 2019, 2:48 PM
This revision was not accepted when it landed; it landed in state Needs Review.Sep 3 2019, 3:26 PM
This revision was automatically updated to reflect the committed changes.