Page MenuHomeFreeBSD

Introduce geom_slow to simulate slow disks
Needs ReviewPublic

Authored by allanjude on Feb 9 2017, 3:22 AM.

Details

Reviewers
scottl
ken
trasz
dvl
imp
asomers
mav
pjd
Group Reviewers
manpages
Summary

Creates an overlay device, similar to gnop, with rate and IOPS limiting

Requested By: Dan Langille

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 7296
Build 7464: arc lint + arc unit

Event Timeline

allanjude retitled this revision from to Introduce geom_slow to simulate slow disks.
allanjude updated this object.
allanjude edited the test plan for this revision. (Show Details)
allanjude added reviewers: imp, pjd, asomers, mav, trasz, ken.

This looks cool and all, but I wrote one and added it to the geom_sched framework about a year ago...
Though this one arguably has more knobs on it.

Why a separate device rather than integrating this with the other features of gnop?

Warner and Warren are both right - this functionality would fit in well in either g_sched, or as part of g_nop.

That having been said, I skimmed this anyway. In addition to the inline comments, I also note that the manpage uses "m" as a scaling factor, when it should probably be "M". For that matter, it's not clear to be where the scaling is actually done...?

sys/geom/slow/g_slow.c
93

(x < 0 || x > 0) -> (x != 0); ==>

if (lasttime->tv_sec == 0 || delta.tv_sec != 0)

128

Should these be sc_iops_time?

214

Remove

284

Since you got sc with M_ZERO, you don't need to explicitly set the counters to zero.

Nevermind, I see you picked this up from g_nop. Arguably, it's unnecessary there too. :-)

sys/geom/slow/g_slow.c
134

I'm a bit concerned about using what's effectively a sleep to pace the i/o. By using pause() here, you're going to put the I/O completion context to sleep, stranding any other work that it might want to do. If this is just an MD disk, I think it's mostly ok because you'll just starve the MD kthread for the device, and there's a kthread per device. If you put this on real hardware though, it's going to put the ithread of that hardware, or the camisr completion thread, to sleep, which can potentially impact all disks on that hardware, not just the target disks. As soon as you've called pause() here, any other I/O that shares this execution context is going to be stranded until it wakes up. Imagine a scenario where you have an LSI controller and 3 disks. The first disk, disk0, is your boot/rootfs, and the other two, disk1 and disk2, are your testing disks. One camisr thread will be created, and that thread will be the execution context for the g_up processing for all of the disks. Once this GEOM class starts pacing disk 1 and 2, disk0 is going to also be paced because of the shared upcall context. This probably isn't what you want. It'll also impact the ability to pace the disks at different rates, leading to some very unpredictable behavior. I'm not opposed to experimental work like this, but I'm concerned that it'll only work correctly in a very narrow case, and be otherwise frustrating and unpredictable. Using this on NVME with multi-queue will likely be chaos. Can this be implemented as a policy module for g_sched instead?