Introduce geom_slow to simulate slow disks
Needs ReviewPublic
Actions

Authored by allanjude on Feb 9 2017, 3:22 AM.

Details

Reviewers

scottl
ken
trasz
dvl
imp
asomers
mav
pjd

Group Reviewers

manpages

Summary

Creates an overlay device, similar to gnop, with rate and IOPS limiting

Requested By: Dan Langille

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 7296
Build 7464: arc lint + arc unit

Event Timeline

allanjude updated this revision to Diff 24903.Feb 9 2017, 3:22 AM

allanjude retitled this revision from to Introduce geom_slow to simulate slow disks.

allanjude updated this object.

allanjude edited the test plan for this revision. (Show Details)

allanjude added reviewers: imp, pjd, asomers, mav, trasz, ken.

Herald added a reviewer: manpages. · View Herald TranscriptFeb 9 2017, 3:22 AM

allanjude added a reviewer: dvl.Feb 9 2017, 3:22 AM

This looks cool and all, but I wrote one and added it to the geom_sched framework about a year ago...
Though this one arguably has more knobs on it.

Why a separate device rather than integrating this with the other features of gnop?

Warner and Warren are both right - this functionality would fit in well in either g_sched, or as part of g_nop.

That having been said, I skimmed this anyway. In addition to the inline comments, I also note that the manpage uses "m" as a scaling factor, when it should probably be "M". For that matter, it's not clear to be where the scaling is actually done...?

sys/geom/slow/g_slow.c
93	(x < 0 \|\| x > 0) -> (x != 0); ==> if (lasttime->tv_sec == 0 \|\| delta.tv_sec != 0)
128	Should these be `sc_iops_time`?
214	Remove
284	Since you got `sc` with `M_ZERO`, you don't need to explicitly set the counters to zero. Nevermind, I see you picked this up from `g_nop`. Arguably, it's unnecessary there too. :-)

asomers added a reviewer: scottl.Feb 22 2017, 8:10 PM

scottl added inline comments.Feb 22 2017, 11:09 PM

sys/geom/slow/g_slow.c
134	I'm a bit concerned about using what's effectively a sleep to pace the i/o. By using pause() here, you're going to put the I/O completion context to sleep, stranding any other work that it might want to do. If this is just an MD disk, I think it's mostly ok because you'll just starve the MD kthread for the device, and there's a kthread per device. If you put this on real hardware though, it's going to put the ithread of that hardware, or the camisr completion thread, to sleep, which can potentially impact all disks on that hardware, not just the target disks. As soon as you've called pause() here, any other I/O that shares this execution context is going to be stranded until it wakes up. Imagine a scenario where you have an LSI controller and 3 disks. The first disk, disk0, is your boot/rootfs, and the other two, disk1 and disk2, are your testing disks. One camisr thread will be created, and that thread will be the execution context for the g_up processing for all of the disks. Once this GEOM class starts pacing disk 1 and 2, disk0 is going to also be paced because of the shared upcall context. This probably isn't what you want. It'll also impact the ability to pace the disks at different rates, leading to some very unpredictable behavior. I'm not opposed to experimental work like this, but I'm concerned that it'll only work correctly in a very narrow case, and be otherwise frustrating and unpredictable. Using this on NVME with multi-queue will likely be chaos. Can this be implemented as a policy module for g_sched instead?