Differential D20529
bhyve: fix reporting of virtio scsi seg_max

Authored by vangyzen on Jun 5 2019, 9:43 PM.

Details
The seg_max value reported to the guest should be two less than the host's maximum, in order to leave room for the request and the response. We hit the "too many segments to enqueue" assertion on OneFS because we increased MAXPHYS to 256 KiB. OneFS now boots in bhyve, where it always panicked during boot before this change. FreeBSD still works, as before.
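
The change amounts to adjusting the advertised config value. A minimal sketch of the device-side fix, with struct and function names modeled on bhyve's pci_virtio_scsi.c rather than copied from it:

```c
#include <stdint.h>

/* Host-side limit on scatter-gather entries per request (64 in bhyve). */
#define VTSCSI_MAXSEG	64

/* Illustrative subset of the virtio-scsi config space. */
struct pci_vtscsi_config {
	uint32_t seg_max;
	/* ...other config fields elided... */
};

static void
pci_vtscsi_cfg_init(struct pci_vtscsi_config *cfg)
{
	/*
	 * Two descriptors in every chain carry the request and response
	 * headers (virtio_scsi_cmd_req/_resp), so advertise only the
	 * remaining entries as data segments.
	 */
	cfg->seg_max = VTSCSI_MAXSEG - 2;
}
```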
Event Timeline

What is the panic message? I believe seg_max is the number of segments in a SCSI command, while vtscsi_maximum_segments computes the maximum number of scatter-gather entries in a request enqueued to the virtqueue. VTSCSI_MIN_SEGMENTS accounts for the SG entries needed for the request and response headers (virtio_scsi_cmd_{req,resp}). The return value is also used to configure the CAM maxio size in vtscsi_cam_path_inquiry. If nothing else, this change is incomplete, because a small enough seg_max value will cause a bogus maxio to be calculated.

Does booting your increased-MAXPHYS kernel on QEMU/KVM also panic? If you change the config seg_max in bhyve to VTSCSI_MAXSEG - 2, does that fix the issue?

"Something something 65 64/0"?
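
For context, a simplified sketch of the guest-side computation described above, patterned on vtscsi_maximum_segments in sys/dev/virtio/scsi/virtio_scsi.c (the in-tree version has more detail, e.g. around indirect descriptors, so this is not a verbatim copy):

```c
#include <sys/param.h>	/* MIN() and PAGE_SIZE on FreeBSD */

#define VTSCSI_MIN_SEGMENTS	2	/* virtio_scsi_cmd_req + _resp */

/*
 * The device's seg_max covers data pages only; the two header
 * segments ride on top.  A MAXPHYS-sized I/O can need up to
 * maxphys / PAGE_SIZE + 1 data segments when the buffer is not
 * page-aligned.
 */
static int
vtscsi_maximum_segments(int seg_max, int maxphys)
{
	int nsegs = VTSCSI_MIN_SEGMENTS;

	if (seg_max > 0)
		nsegs += MIN(seg_max, maxphys / PAGE_SIZE + 1);
	else
		nsegs += 1;

	return (nsegs);
}
```

The result then feeds the maxio reported to CAM in vtscsi_cam_path_inquiry, roughly (nsegs - VTSCSI_MIN_SEGMENTS - 1) * PAGE_SIZE, which is why an undersized seg_max produces a bogus maxio.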
> Does booting your increased MAXPHYS kernel on QEMU/KVM also panic?

No -- but I believe that is only because QEMU's default number of segments is 128, while bhyve's is 64. (256 KiB MAXPHYS / PAGE_SIZE => 64 segments.)
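
Back-of-the-envelope numbers, assuming 4 KiB pages and the computation sketched above:

```c
/*
 * 256 KiB MAXPHYS -> 64 pages, i.e. up to 65 data segments for a
 * misaligned buffer, plus 2 header segments per request.
 *
 *   bhyve: seg_max = 64 with a 64-entry virtqueue, so a large
 *          I/O's descriptor chain cannot fit -> panic.
 *   QEMU:  seg_max = 128, so the same I/O stays well inside the
 *          advertised limit and queue depth.
 */
```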
> If you change the config seg_max in bhyve to VTSCSI_MAXSEG - 2, does that fix the issue?

Oh! Thanks for that explanation. I'll try that now.
> "Something something 65 64/0"?

Essentially, yeah: virtqueue_enqueue: vtscsi0 request - too many segments to enqueue: 65, 64/0
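
For reference, that message comes from an assertion in the guest's virtqueue code (sys/dev/virtio/virtqueue.c). A userland paraphrase, with the format read as "needed, free/indirect" (the /0 matching the observation below that indirect descriptors aren't negotiated):

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Stand-in for the kernel assertion: "65, 64/0" reads as 65
 * descriptors needed, 64 free, 0 available via indirection.
 */
static void
virtqueue_enqueue_check(int needed, int free_cnt, int indirect)
{
	if (needed > free_cnt) {
		fprintf(stderr,
		    "too many segments to enqueue: %d, %d/%d\n",
		    needed, free_cnt, indirect);
		abort();
	}
}
```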
Ah, that makes sense: bhyve vtscsi sets the virtqueue size to the same value as the maximum segment count (and I guess indirect descriptors aren't negotiated). I suppose the driver could be refactored to further cap seg_max by the queue size (or to fail to attach, since this is a device bug), although this device has multiple virtqueues, so that might artificially limit it. IIRC there is some verbiage in the spec that basically says "don't be stupid around virtqueue descriptor sizes". A quick look at the NetBSD, OpenBSD, and Linux VirtIO SCSI drivers shows that all of them honor seg_max as-is. So this is probably only worth fixing in bhyve, perhaps with a comment or a compile-time assert.
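
Such a cap might look roughly like this on the driver side. This is purely hypothetical; no such clamp existed in the driver at the time of this review:

```c
#define VTSCSI_MIN_SEGMENTS	2	/* request + response headers */

/*
 * Hypothetical guest-side seat belt: clamp the device-reported
 * seg_max so that a maximal chain (data plus two headers) still
 * fits in a virtqueue of vq_size descriptors.
 */
static int
vtscsi_clamp_seg_max(int dev_seg_max, int vq_size)
{
	int cap = vq_size - VTSCSI_MIN_SEGMENTS;

	return (dev_seg_max < cap ? dev_seg_max : cap);
}
```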
That worked. Thanks for the further explanation. I'll try to word that into a comment later this morning, since a bare "foo - 2" clearly needs one.

Wait, I think we should still have a seat belt in the guest driver. E.g., it is much easier to update a single guest than a hypervisor host. And does it make sense to limit to sub-256 KiB I/Os when Linux/QEMU allows 512 KiB?
I agree, especially for Isilon's usage. I've been distracted most of the morning. I'll look at which seatbelt makes the most sense.
> And does it make sense to limit to sub-256 KiB I/Os when Linux/QEMU allows 512 KiB?

I don't know of a reason, but I'm not familiar with this area. @bryanv?

Please document why the -2 is needed, per the earlier discussion. Does this need urgent attention to get into 11.3?

Adding the -2 seems very similar to the change made to virtio-block in https://svnweb.freebsd.org/base?view=revision&revision=347033, which also has a -2 in its seg_max.
> Does this need urgent attention to get into 11.3?

In my opinion, no. To my knowledge, I'm the only person to report this, and I only hit it by running OneFS under bhyve, which uses a larger MAXPHYS.

Note that there are other FreeBSD consumers who use a large MAXPHYS (even larger than 256 KiB), although I don't know if they run as bhyve guests (Netflix, some tape filesystem folks, ...) (probably not).
I'm not very familiar with bhyve VirtIO SCSI, so I'm not sure whether the 64-segment limit was arbitrary or due to a CTL limit.