Page MenuHomeFreeBSD

bhyve - Snapshot Save and Restore multiple devices
Needs ReviewPublic

Authored by on Sep 10 2020, 10:21 AM.
Referenced Files
Unknown Object (File)
Mon, May 6, 7:26 PM
Unknown Object (File)
Fri, May 3, 5:28 PM
Unknown Object (File)
Tue, Apr 30, 6:26 PM
Unknown Object (File)
Tue, Apr 30, 6:26 PM
Unknown Object (File)
Tue, Apr 30, 6:25 PM
Unknown Object (File)
Tue, Apr 30, 3:46 PM
Unknown Object (File)
Sat, Apr 20, 6:19 PM
Unknown Object (File)
Apr 17 2024, 1:27 AM


Group Reviewers

This feature implements save-restore for multiple devices of the same type.

Initially the restore feature only restored the first device of a certain type.

The feature was rebased over commit 4a16bd8bb4cf1217720c21ae363626df51474edd.

Test Plan

For a guest that has 2 ahci disks and 2 virtio-net network adapters the teste cases were:

  • create a big file on one of the disk
  • start a copy process of the large file from one disk to the other
  • suspend the guest
  • restore the guest and the copy process should finish and the files are the same in data and size
  • initiate a ping process one virtio-net adapter at a time and suspend the guest
  • on restore the ping process should continue for the used adapter

Diff Detail

rG FreeBSD src repository
Lint Skipped
Tests Skipped

Event Timeline

vm_snapshot_kern_structs was called two times and probably caused a problem with save restore for multiple devices

vm_snapshot_kern_structs was called two times and probably caused a problem with save restore for multiple devices

Besides my another comments and questions, what is advantage of changing order of the vm_snapshot_kern_structs and vm_snapshot_user_devs (if look at the diff) ?


calloc(3) initialises all memory with zero, so you don't need explicitly assign was_restored to 0.

Also this is related to all below "was_restored = 0" changes.


Instead allocating dev_info, I would like to suggest to use one of the following:

  1. Use "sc" that already allocated in the line 2345, i.e. put some fields to sc's struct for every PCI type.
  2. Introduce a functions like:

    snapshot_pci_device_register(pi)"; for pci devices snapshot_device_register(cb, data, name); for non-pci devices

And replace all init code for all devices with those functions.

This make code simpler and incapsulate all allocations, etc. Thus each decl "struct vm_snapshot_dev_info *dev_info" and code below will be removed in xxx_init() functions.


If meta_data always should be not NULL, just place assert() here.


I would suggest to call pci_pause()/pci_resume() unconditionally for each PCI device and if pe_pause is NULL do not do anything. The same for pci_resume().
That will simplify initialisation and if PCI device has implemented pe_pause/pe_resume, a handler will be called during suspend/resume. If PCI device does not implement pe_pause/pe_resume, just ignore it.


Probably you should correct a code to conform TODO's requirements and then remove TODO comment.


Is this comment still required?


What is this function currently used for? It only calls lookup_devs().

Probably lookup_devs() should be renamed with walk_and_restore_devices()


commented code - bad style ?


Put fprintf under ifdef DEBUG or even replace with dprintf?


Some of the information will need to be used outside the context of the device itself - the static array of user devices has been removed, so the snapshot component must keep track of the functions that need to be called when performing a snapshot or restore. Also, for devices like atkbdc, the snapshot_pci_device_register will likely not be compatible.

Before having support for atkbdc, the static userspace device array was also not required, since you could discover most devices through the PCI bus; however, if other device types are to be supported, a more generic approach is required.

However, I agree that some of the logic, like allocating the structure and performing a part of the initialization could be done separately.


Changing an order of vm_snapshot_kern_struct() and vm_snapsshot_user_devs() makes kernel structure corruption on resume and host panic. For example, I got panic at restore_guest_fpustate() when vcpu->guest_xcr0 became corrupted:

This is due to vm_snapshot_kern_struct() resets "offset" to 0 while vm_snapsshot_user_devs() uses lseek(SEEK_CUR).

Removed redundant lines; applied most of the recommended changes.

! In D26387#593374, wrote:
Removed redundant lines; applied most of the recommended changes.

So I am adding rework of this review multiple-devices-v1.patch. Please look at this. Probably, if this patch is considered as good and better, I can
upload here to the review or can create another review.

Idea is: using unique name as identifier. For PCI devices it would be, for example, virtio-blk-pci-$bus-$slot-$func


  • Use unique name to identify device.
  • Add simple register_snapshot_dev() function, for any type of device.
  • Changed some function names and defs related to "user devices" and renamed to "devices"
  • Renamed "struct" section to "kern_structs"
  • Removed pci_find_slotted_dev function

For example: "devices" section in $dumpfile.meta would be:

"devices": [
      "device": "lpc-pci-0-31-0",
      "size": 520,
      "file_offset": 80418
      "device": "virtio-net-pci-0-4-0",
      "size": 66180,
      "file_offset": 80938
      "device": "virtio-blk-pci-0-3-0",
      "size": 8819,
      "file_offset": 147118
      "device": "hostbridge-pci-0-0-0",
      "size": 436,
      "file_offset": 155937
      "device": "atkbdc",
      "size": 141,
      "file_offset": 156373

Rebased with a newer upstream and applied the changes from the review