Move vm_snapshot_buf_cmp prototype under #ifndef _KERNEL.
Feb 6 2024
Removed the vm_snapshot_buf_cmp() function from the kernel code.
Feb 5 2024
Dec 28 2023
In D43218#985023, @rew wrote: As a side thought, I wonder whether FB_SIZE and SNAPSHOT_BUFFER_SIZE should be computed on the fly at some point in the future.
Dec 5 2023
In D39620#978525, @sean_rogue-research.com wrote: Assuming the bug I see is indeed this bug, I can reproduce it in all 3 of my Linux VMs, running Ubuntu 20.04 LTS and 22.04 LTS, which all use Linux kernel 5.x. It looks like there is no LTS Ubuntu with kernel 6.x until the upcoming 24.04, but 23.10 uses kernel 6.5, which I could try. Is kernel 6.x expected to fix something?
Dec 4 2023
In D39620#978503, @sean_rogue-research.com wrote:...
I'm not very familiar with phabricator, so I'm unsure: is this fixed in FreeBSD 13.2 or only 14.0?
Nov 7 2023
In D34718#969176, @freebsd_ny-central.org wrote: That's all. Kernel crashes with panic: general protection fault. Who wants to fix that?)
I've read through the various open reviews and your suggestions for a new file format on the mailing list. Would a more robust file format fix that? I.e. adding a checksum or signing the file with a cryptographic hash?
Oct 17 2023
Jun 29 2023
Jun 22 2023
In D34718#925884, @gusev.vitaliy_gmail.com wrote: I suppose that fixing existing issues and making the code stable is more important for now.
Jun 21 2023
In D34718#925873, @elenamihailescu22_gmail.com wrote: Do you already have a migration feature/tool implemented (bmigrate)? If so, please let us know if you plan to make it available in the future.
Jun 20 2023
In D34718#925665, @mgrooms_shrew.net wrote: In D34718#925619, @gusev.vitaliy_gmail.com wrote: So instead of having a lot of code in bhyve, I would suggest having a set of primitives that help implement warm and live migration outside of the bhyve process.
Wait. You're casually suggesting that years of work be discarded in favor of an alternate approach. Please be more specific about what you perceive to be the problem with the method taken in this patch series. "I have another idea" isn't a very good reason to go in a completely different direction.
In D34718#925608, @elenamihailescu22_gmail.com wrote: We consider the migration part a must-have feature for a mature hypervisor and we really want to have it for bhyve. The end goal is to have a live migration feature for bhyve. That means inspecting the guest memory pages to check for changes between rounds, which also modifies a dirty bit. The inspection (and the changes) should be done from within the bhyve process. I'm not sure how this part would be tackled in your suggestion. However, please keep in mind that this approach was discussed with the bhyve maintainers 4 years ago. We asked for advice and came up with this idea of having a custom dirty bit for migration.
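To make the "rounds" and dirty-bit idea above concrete, here is a minimal simulation of iterative pre-copy. It is only a sketch under stated assumptions: the function name `precopy_migrate`, the `threshold` parameter, and the list-based model of guest memory are all hypothetical and are not taken from the patch series.

```python
def precopy_migrate(memory, writes_per_round, threshold=1):
    """Simulate iterative pre-copy (hypothetical model, not bhyve code).

    memory           -- list of page contents on the source host (mutated
                        in place to mimic a guest that keeps running)
    writes_per_round -- list of {page_index: new_value} dicts, one per
                        round, standing in for guest writes
    threshold        -- stop iterating once this few pages are dirty
    """
    dest = [None] * len(memory)
    pending = set(range(len(memory)))   # round 1 sends every page
    sent = []                           # pages transferred per round

    for writes in writes_per_round:
        for page in pending:
            dest[page] = memory[page]
        sent.append(len(pending))

        # The guest ran during the transfer: apply its writes and treat
        # the written pages as the "dirty bits" for the next round.
        for page, value in writes.items():
            memory[page] = value
        pending = set(writes)

        if len(pending) <= threshold:
            break

    # Final stop-and-copy: the guest is paused, so nothing gets dirtied.
    for page in pending:
        dest[page] = memory[page]
    sent.append(len(pending))
    return dest, sent
```

Each round only resends the pages written since the previous round, which is why something has to track per-page dirty state while the guest keeps running.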
In D34718#925598, @darius.mihaim_gmail.com wrote: The *live* migration must be part of bhyve, as the migration process must transfer the virtual machine's memory from one host to the other while the virtual machine runs on the original host.
Of course, some changes to the migration code are unavoidable in case the snapshot process changes in some particular ways, but they should remain minimal in my expectations.
In D34718#925583, @darius.mihaim_gmail.com wrote: In D34718#925207, @afedorov wrote: I think we should bring support for snapshots first. And enable BHYVE_SNAPSHOT by default.
To do this, it remains to agree on the snapshot format: https://lists.freebsd.org/archives/freebsd-virtualization/2023-May/001295.html
I'm not sure whether to tie the transfer of the memory dump to TCP.
Or add an option similar to 'zfs send' that is independent of the transport. For example, share a memory image using SSH or NFS.
Not sure how the transfer would be *tied* to the TCP protocol just by having the implementation use TCP by default. Many tools like curl, socat and browsers have figured out that you can use URIs to specify protocols and even authentication when connecting to a service. The migration parameters can be extended to use ssh://, file:/// or even the basic - for piping the binary data to standard output.
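As an illustration of the URI idea above, a destination string could be classified roughly as follows. This is a sketch only: the function `parse_migration_target`, the accepted schemes, and the returned dictionary layout are assumptions made for the example and do not describe an existing bhyve or bhyvectl option.

```python
from urllib.parse import urlsplit


def parse_migration_target(target):
    """Classify a migration destination string (hypothetical syntax)."""
    # A bare "-" means: write the binary stream to standard output.
    if target == "-":
        return {"transport": "stdout"}

    parts = urlsplit(target)
    if parts.scheme == "file":
        return {"transport": "file", "path": parts.path}
    if parts.scheme in ("tcp", "ssh"):
        return {
            "transport": parts.scheme,
            "host": parts.hostname,
            "port": parts.port,        # None when no port is given
            "user": parts.username,    # None when no user is given
        }
    raise ValueError("unsupported migration target: %r" % target)
```

With this shape, TCP is just one scheme among several, so the default transport does not constrain the others.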
I think asking this question more than one year after this review has been open, and more than two after the original review (https://reviews.freebsd.org/D28270) was created is simply meant to delay accepting the changes. I understand that having a good and secure code base is a must, but this change simply asks for "why don't we over engineer this part which has very little to do with the actual virtual machine migration process?", without actually providing a good case for "this is not up to the BSD standards".
The same goes for this review. Is it really required to place this code in bhyve, or can it be added to an upper layer, for example bhyvectl?
Before this work proceeds to commit, could you provide a high-level design for this approach?
Jun 14 2023
In D40105#923164, @jhb wrote: Can you provide some before/after JSON snippets?
Jun 7 2023
In D40105#921197, @jhb wrote: Can you provide more description of what this change is doing? From what I can tell, it uses err directly to abort quicker if it fails to restore an in-kernel structure, and it inlines vm_restore_kern_struct into the loop in vm_restore_kern_structs. It renames vm_snapshot_kern_structs, which is somewhat gratuitous IMO as it now no longer matches the other function names like vm_snapshot_user_devs. But I think the big change is not using lookup_struct, and using meta->dev_name instead of meta->dev_req as the main key for the JSON for a given kernel struct? I think this means you can avoid lookup_struct because now, instead of a flat array of all kernel structures, they are separate objects with unique names? Can you expand on that more, perhaps with some examples of before/after JSON snippets? I would also suggest only doing the JSON change in this commit and not mixing in the other changes that I think obscure the real change you are making (e.g. the function rename, or inlining the function).
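The difference between a flat array and uniquely named objects can be sketched as below. The metadata layout here is invented for illustration: the keys `kern_structs`, `device`, `size`, and `file_offset`, and the sample values, are assumptions and are not copied from D40105 or from an actual bhyve snapshot file.

```python
# Hypothetical "before" layout: one flat array, so finding an entry
# means scanning it (the role a helper like lookup_struct would play).
flat_meta = {
    "kern_structs": [
        {"device": "vhpet", "size": 1024, "file_offset": 0},
        {"device": "vlapic", "size": 4096, "file_offset": 1024},
    ]
}

# Hypothetical "after" layout: each structure is its own object, keyed
# by a unique name, so no search helper is needed.
keyed_meta = {
    "kern_structs": {
        "vhpet": {"size": 1024, "file_offset": 0},
        "vlapic": {"size": 4096, "file_offset": 1024},
    }
}


def lookup_flat(meta, name):
    """Linear scan over the flat array; returns None if not found."""
    for entry in meta["kern_structs"]:
        if entry["device"] == name:
            return entry
    return None


def lookup_keyed(meta, name):
    """Direct access by unique name; returns None if not found."""
    return meta["kern_structs"].get(name)
```

Both lookups return the same size and offset for a given name; the keyed layout simply moves the search into the structure of the document itself.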
May 17 2023
Fixed note from @corvink
In D40108#913868, @corvink wrote: Why don't we call pci_snapshot_pci_dev unconditionally even if pe_snapshot is NULL?
May 16 2023
@rew Could you look again?
May 15 2023
Fixed notes from @rew.
In D40108#913426, @gusev.vitaliy_gmail.com wrote: In D40108#913423, @rew wrote: Not sure that devices should define a snapshot handler when snapshots are not supported by the device.
At least it saves pdi.pi_cfgdata.
Closing, as the whole patch series D38886 has been committed.
In D40108#913423, @rew wrote: Not sure that devices should define a snapshot handler when snapshots are not supported by the device.
May 10 2023
May 8 2023
In D35447#910663, @corvink wrote: Patch looks good but it doesn't apply any more. Could you send an updated version please?
Rebased on main.
May 5 2023
@corvink Could you look at this? While Patrick's changes require (per the comment) a lot of modification to generic code, this simple fix could solve the issue related to suspend/resume.
May 2 2023
In D35590#908783, @corvink wrote: @gusev.vitaliy_gmail.com As said, please split this commit. You can also send me a link to a personal git repo where I can pull the changes from.
Apr 25 2023
@corvink Can it be committed, or should I improve/correct it?
Apr 24 2023
Corrected the SDM reference: Section 29.2.1 --> Section 30.2.1
In D39620#905476, @markj wrote: What exactly is the problematic scenario? How do we end up with a pending, undelivered interrupt after a vmexit? Presumably the guest must have enabled interrupts before executing HLT.
The change itself seems reasonable to me, but I'd like to understand it better.
up?
Apr 19 2023
Apr 18 2023
Apr 17 2023
Mar 27 2023
Mar 14 2023
Mar 13 2023
Mar 10 2023
In D35590#887928, @rew wrote: In D35590#887785, @gusev.vitaliy_gmail.com wrote: That will work, but this is not easier than the current review's approach.
This counter-argument lacks any technical substance.
Mar 9 2023
@rew Of course if assume that only PCI and Keyboard devices are used, introducing
Rebased
In D38860#887537, @rew wrote: In D38860#887381, @rew wrote: https://reviews.freebsd.org/D38858 also needs to be addressed before this patch is committed.
I've dropped my request for changes in D38858 - there's nothing blocking this review from being landed.
Corrected fd_meta usage in case of error.
In D35454#887531, @emaste wrote: FYI if you use just the D# notation Phabricator will add markup to show which reviews are open/closed e.g.
D38866 libvmm: Add missed ioctl to vm_ioctl_cmds
D38855 bhyve: [snapshot] Do not flush readonly device at blockif_pause()
D38856 bhyve: [snapshot] Add cap limits for ipc socket
D38872 bhyve: Exit with EX_OSERR if init checkpoint or restore time failed
D38857 bhyve: Init checkpoint before caph_enter()
D38858 bhyve: Use directory fd with checkpoint
D38860 bhyve: Enable Capsicum for snapshots
Mar 8 2023
In D35590#886738, @rew wrote: In D35590#886683, @gusev.vitaliy_gmail.com wrote: The main idea of this review is to introduce struct snapshot_dev, which will be used by the following nvlist changes.
Seems like it'd make more sense to generate this meta information from the in-core configuration at the time the snapshot is taken.
Have you considered this?
Mar 7 2023
In D35590#886517, @rew wrote: In D35590#886289, @gusev.vitaliy_gmail.com wrote: I guess you think that I will upstream these changes and forget about the format work. But I am going to create the nvlist changes (including the format change) as soon as that review is committed.
Before doing the work, I strongly encourage you to write an email to the virtualization mailing list with the proposed file format.
In D35590#886608, @corvink wrote: Only the file format is controversial. So, if you split this review into smaller ones which are unrelated to the file format change, we can make some progress.
@markj Could you re-approve and commit?
Mar 6 2023
Fixed "_DP_9p= sbuf" indentation.
In D35590#885880, @rew wrote: How is this a stopper for the file format changes?
In D38905#885863, @markj wrote: Seems ok. Did you test 9pfs in the guest? I would suggest trying it if so. In particular, we have no other examples where a shared library depends on casper libraries. Up until now, it is always executables which depend on casper libraries. So there might be some subtle problem or interaction.
Mar 4 2023
@rew This is a stopper for the follow-up nvlist changes. Please review and possibly approve. Thanks.
Mar 3 2023
In D38858#885643, @rew wrote: In D38858#885637, @gusev.vitaliy_gmail.com wrote: Why do you think the single-format work (nvlist) should come *before* enabling Capsicum? The nvlist work will involve a lot of changes, and we can get stuck if discussing every useful thing eats a lot of time; the risk is that we do not move forward for months. Just note that the multiple-device review is 7-8 months old. That is crazy for code that is under #ifdef BHYVE_SNAPSHOT. We need to speed up this work. It has been about 2 years since Snapshot/Resume was added, and there has been no significant progress.
If you have questions or notes about the code, please write here. Otherwise please accept the reviews.
The only file descriptor that needs to be passed to bhyve is the snapshot file that is being written to, which won't require opening a file descriptor to a directory.
Rebase this review after the file format work has been merged into main.
Thanks.
In D38858#885635, @rew wrote: In D38858#885606, @gusev.vitaliy_gmail.com wrote: The nvlist implementation will use a single file for everything:
- config
- vram
- kernel data
- devices data
So you are right, using multiple files is not reasonable and is hard to operate.
In which case, the file descriptor change needs to happen after bhyve starts using a single file for snapshots.
In D38858#885604, @rew wrote:
..
see the original commit message that brought the snapshot code in
The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format and future revisions to save and restore will break binary compatibility of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility.
If I understand right, all of these things could be achieved with the nvlist implementation. Thanks.
Why does there need to be multiple files for a single snapshot?