Page MenuHomeFreeBSD

Improve bhyve exit(3) error code.
ClosedPublic

Authored by araujo on Jul 6 2018, 9:04 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 17, 10:22 PM
Unknown Object (File)
Fri, Jan 17, 12:16 PM
Unknown Object (File)
Sun, Jan 12, 1:44 AM
Unknown Object (File)
Fri, Jan 10, 4:14 AM
Unknown Object (File)
Tue, Jan 7, 2:53 PM
Unknown Object (File)
Sat, Dec 28, 4:59 PM
Unknown Object (File)
Dec 20 2024, 10:20 AM
Unknown Object (File)
Nov 30 2024, 5:00 PM
Subscribers

Details

Summary

The bhyve(8) exit status indicates how the VM was terminated:

0       rebooted
1       powered off
2       halted
3       triple fault

The problem is when we have wrappers around bhyve that parses the exit error code and gets an exit(1) for an error but inteprets it as "powered off".
So to mitigate this issue and makes it less error prone for third part applications, I have replaced exit(3) with value 1 to err(3) with a proper exit error from sysexits(3).

Test Plan

As an example without patch:

  1. "Could not open backing file":

root@mp03:/ # bhyve -A -H -P -c 3 -m 8098M -s 0:0,hostbridge -s 31,lpc -l com1,/dev/nmdm1A -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 29,fbuf,tcp=10.20.21.2:6032,w=800,h=600 -s 30,xhci,tablet -s 4,virtio-blk,/dev/la/la -s 5,virtio-net,tap10,mac=00:a0:98:2c:d5:9f FreeBSD
bhyve: Could not open backing file: /dev/la/la: No such file or directory
Could not open backing file: No such file or directory
root@mp03:/ # echo $?
1

  1. "Invalid PCI slot number":

root@mp03:/ # bhyve -A -H -P -c 3 -m 8098M -s 0:0,hostbridge -s 31,lpc -l com1,/dev/nmdm1A -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 29,fbuf,tcp=10.20.21.2:6032,w=800,h=600 -s 30,xhci,tablet -s 4,virtio-blk,/dev/la/la -s 50000000,virtio-net,tap10,mac=00:a0:98:2c:d5:9f FreeBSD
Invalid PCI slot info field "50000000,virtio-net,tap10,mac=00:a0:98:2c:d5:9f"
root@mp03:/ # echo $?
1

Example with this patch:

  1. Could not open backing file":

root@mp03:/ # bhyve -A -H -P -c 3 -m 8098M -s 0:0,hostbridge -s 31,lpc -l com1,/dev/nmdm1A -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 29,fbuf,tcp=10.20.21.2:6032,w=800,h=600 -s 30,xhci,tablet -s 4,virtio-blk,/dev/la/la -s 5,virtio-net,tap10,mac=00:a0:98:2c:d5:9f FreeBSD
bhyve: la Could not open backing file: /dev/la/la: No such file or directory
Could not open backing file: No such file or directory
bhyve: device emulation error in its initialization
root@mp03:/ # echo $?
69

  1. "Invalid PCI slot number":

root@mp03:/ # bhyve -A -H -P -c 3 -m 8098M -s 0:0,hostbridge -s 31,lpc -l com1,/dev/nmdm1A -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 29,fbuf,tcp=10.20.21.2:6032,w=800,h=600 -s 30,xhci,tablet -s 4,virtio-blk,/dev/la/la -s 500000000,virtio-net,tap10,mac=00:a0:98:2c:d5:9f FreeBSD
Invalid PCI slot info field "500000000,virtio-net,tap10,mac=00:a0:98:2c:d5:9f"
bhyve: invalid pci slot
root@mp03:/ # echo $?
64

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Looks good over all, just some nits on errx vs err when it appears that there is a proper errno avaliable.

I am not so sure on using sysexits.h either, would rather have a bhyve specific list of error codes as is currently implemented (1 to 4), only expanded to include the any many error conditons that it can exit on.

I have hated tracing down why I get an exit(1) in the code, though this does improve on that some, most of them are just going to be a small set of sysexits codes, and it would greatly help if each exit code was unique and pointed to a very specific problem.

usr.sbin/bhyve/bhyverun.c
845 ↗(On Diff #44960)

Shouldnt this be err() since perror before would of printed the string for errno?

859 ↗(On Diff #44960)

Again, errx vs err as here we should have errno set?

880 ↗(On Diff #44960)

errx vs err

1023 ↗(On Diff #44960)

This is usually not a temporary condition, it is almost certainly requesting more vcpu than bhyve supports and that is, at this time, a compiled in value of 16.

usr.sbin/bhyve/mevent_test.c
204 ↗(On Diff #44960)

errx vs err Not gona list any more of these, but please address them.

Thank you! Looks good to me, aside of mentioned by Rod err()/errx() issues. All perror()'s should probably turn into err() instead of errx().

usr.sbin/bhyve/bhyverun.c
1054 ↗(On Diff #44960)

Wouldn't it be better as: "device emulation initialization error"?

usr.sbin/bhyve/pci_e82545.c
2227 ↗(On Diff #44960)

Not directly to this patch, but in general, it would be good to get some consistency between different drivers whether exit immediately or report initialization error.

  • Address @rgrimes and @mav comments.
  • Where we used to have perror(1) now we use err(3) with the global variabel errno intro(2).

The global errno where tested with the vm reinit:
root@mp03:/z/vms/freebsd # bhyve -A -H -P -c 6 -m 8098M -s 0:0,hostbridge -s 31,lpc -l com1,/dev/nmdm1A -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd -s 29,fbuf,tcp=10.20.21.2:6032,w=800,h=600 -s 30,xhci,tablet -s 4,virtio-scsi,dev=/dev/cam/ctl1.1,iid=1 -s 5,virtio-net,tap0,mac=00:a0:98:2c:d5:9f FreeBSD
bhyve: vm_reinit: Device busy
root@mp03:/z/vms/freebsd # echo $?
16

  • Use EX_OSERR instead of EX_NOPERM slipped in by mistake.

This is also going to require a man page update, as now there are more than the 4 exit codes listed in the man page.

usr.sbin/bhyve/bhyverun.c
853 ↗(On Diff #45048)

Not sure how I missed this the first time, but this is not a temp fail either. Probably EX_USAGE.

Looks good over all, just some nits on errx vs err when it appears that there is a proper errno avaliable.

I am not so sure on using sysexits.h either, would rather have a bhyve specific list of error codes as is currently implemented (1 to 4), only expanded to include the any many error conditons that it can exit on.

I have hated tracing down why I get an exit(1) in the code, though this does improve on that some, most of them are just going to be a small set of sysexits codes, and it would greatly help if each exit code was unique and pointed to a very specific problem.

The idea of this patch is make the exit code 1 be more unique to 'powered off', although I agree that we need more unique exit codes for each specific problem. I think to achieve a full and more useful exit code we shall need a generic error interface implementation where we can use among all bhyve drivers. However to make that happens it will takes me some time to audit each driver and make them more consistent with error codes.

  • Use EX_USAGE instead of EX_TEMPFAIL reported by @rgrimes, I have overlooked this one.

This is also going to require a man page update, as now there are more than the 4 exit codes listed in the man page.

I don't think we really need that, because those are the main error codes that users should look at. I do believe later as soon as we have a better and full error code approach then we deserve a manpage update.

Best,

err(errno) is probably not going to work well. For example ENOENT is 2 which maps to "2 halted" in the manpage.

I would rather avoid the use of sysexits(3) as it is not portable and it is actually mostly not used in FreeBSD. I would suggest just adding a new exit code for these errors that is documented in the manpage, e.g. "4 failed to start" and change all these to use that.

usr.sbin/bhyve/bhyverun.c
880 ↗(On Diff #44960)

I doubt errno is a good exit value to use.

  • Address @jhb suggestion.
  • Update bhyve(8).

Oof, I failed to realize that some of these errors are not startup related, so maybe just '4 exited due to an error' in the manpage. I do think the earlier changes to use err() were also good changes, I would just stick with '4' for the exit value. However, if you want to commit this as-is for now and adopt err() in place of fprintf/perror + exit as a later change that is ok with me as well. This is fine with me with the manpage tweak (or some similar tweak).

This revision is now accepted and ready to land.Jul 10 2018, 9:17 PM
In D16161#343831, @jhb wrote:

Oof, I failed to realize that some of these errors are not startup related, so maybe just '4 exited due to an error' in the manpage. I do think the earlier changes to use err() were also good changes, I would just stick with '4' for the exit value. However, if you want to commit this as-is for now and adopt err() in place of fprintf/perror + exit as a later change that is ok with me as well. This is fine with me with the manpage tweak (or some similar tweak).

OK, I will uppate the manpage and commit it as-is just to make the exit(1) more unique for now. I have intention to revisit it and improve all the exit returns. I will replace fprintf/perror + exit as soon as I finish the drivers analysis to make sure what can be improved as error, exit and return code.

Thanks @jhb

  • Tweak the bhyve(8) to reflect the general error code 4.
This revision now requires review to proceed.Jul 11 2018, 3:11 AM
This revision was not accepted when it landed; it landed in state Needs Review.Jul 11 2018, 3:23 AM
This revision was automatically updated to reflect the committed changes.