Details

Reviewers

mav
freqlabs

Group Reviewers

bhyve
manpages

Summary

To be able to plugin different storage systems an abstraction
layer for the block storage is "required".

This is currently implemented in the existing block_if.{hc}
files.
The current callee access routings have their prefixes renamed
from blockif_ to blocklocal_.
And have moved into a separate pair of files block_local.{hc}
Functional there should not be an change, other that the
calls to the storage API now go thru a an extra set of tabled
functions.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 30036
Build 27847: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Remove bsdinstall diffs that got in this set by accident.

Harbormaster completed remote builds in B28875: Diff 67203.Jan 23 2020, 3:50 PM

Remove struct blockif_ctxt *bc from struct pci_vtblk_softc.
If one needs block device info, you can get it thru the backend-descriptior

fixed a few more compile errors.

Harbormaster completed remote builds in B28876: Diff 67204.Jan 23 2020, 4:00 PM

Reversed the way the backend selector is integrated in blockif.
Now every blockif_ctxt is extended with a reference to a backend provider.

Harbormaster completed remote builds in B28939: Diff 67340.Jan 26 2020, 11:51 PM

wjw_digiware.nl edited the summary of this revision. (Show Details)Jan 26 2020, 11:54 PM

I'll be taking a closer look when I get a chance but at a glance at the diff this looks like a much cleaner change now. Thank you for taking the time to work on it!

bhyve/blocklocal: do external refence to init() and cleanup()

init() and cleanup() are calls that are only refered to from with
the actual backend. So block_if does not need to know them.

Harbormaster completed remote builds in B28946: Diff 67353.Jan 27 2020, 2:09 PM

freqlabs added inline comments.Feb 6 2020, 1:53 AM

usr.sbin/bhyve/block_if.c
70–71	Indentation looks different than in the rest of the file.
152	`!= NULL` please to avoid confusion about the type.
171	This isn't the only place I see this, but I'll just point it out here: Minor style nit - `Values in return statements should be enclosed in parentheses.` return (bc->be->close(bc));
usr.sbin/bhyve/block_if.h
141	Please prefix the struct members with `bb_`.
182	I think it would be a bit simpler to use an opaque pointer for private data.

Ah, Willem, this week we have SUSE Hack Week, and I was not aware that you had already been working on rbd driver for bhyve, and picked this as my Hack Week project. And just today accidentally (after your comment to my qemu review) I noticed your review request. I already have some working code [1]. It also adds an abstraction layer in a way similar to yours and has a working driver. It still needs some work: right now librados and librbd from ports are just linked to bhyve, which is not acceptable for upstream, I think we need some solution to try dynamically load the libraries (or even rbd driver) at run time. Anyway I already tried it and it looks like work.

So it looks like unfortunately we did duplicate work. But I am glad you are working on it and would be happy if your version goes upstream, as you started to work on it early. You could look at my version for some ideas though. As it already has rbd driver it might be clearer for reviewers what is the goal of adding this abstraction.

[1] https://github.com/freebsd/freebsd/compare/master...trociny:bhyve-rbd

In D23010#519020, @trociny wrote:

Ah, Willem, this week we have SUSE Hack Week, and I was not aware that you had already been working on rbd driver for bhyve, and picked this as my Hack Week project. And just today accidentally (after your comment to my qemu review) I noticed your review request. I already have some working code [1]. It also adds an abstraction layer in a way similar to yours and has a working driver. It still needs some work: right now librados and librbd from ports are just linked to bhyve, which is not acceptable for upstream, I think we need some solution to try dynamically load the libraries (or even rbd driver) at run time. Anyway I already tried it and it looks like work.

So it looks like unfortunately we did duplicate work. But I am glad you are working on it and would be happy if your version goes upstream, as you started to work on it early. You could look at my version for some ideas though. As it already has rbd driver it might be clearer for reviewers what is the goal of adding this abstraction.

[1] https://github.com/freebsd/freebsd/compare/master...trociny:bhyve-rbd

Great, I was indeed wondering where the QEMU fix al of a sudden caem from, and when you'd do something for bhyve.

Well. if nothing else, your work is also the "proof" that we at least should have something like a block-device multiplexer.
And I'm usually submitting work in small steps to give reviewers the oportunity to get a good feel of the code, without
too much distraction of putting 2 problems in 1 PR.

Last three weeks have been rather busy with $JOB, so have not yet adressed the comments here,
And getting things commited does not have the same pace as the Ceph PRs have.
But I'll have time this weekend to work on my Ceph stuff. And I'll look at yout Git as well.

And yes, in the end I thing it will require something of a DL-type of loading this "driver".
Perhaps create a directory like /usr/libexec/bhyve/blocks and load block_<driver>.so if there is
a reference to the driver on the PCI specification in the command-line.

trociny added inline comments.Feb 12 2020, 3:50 PM

usr.sbin/bhyve/block_if.c
139	BTW, in my version I just check `optstr` for a prefix ending with semicolon (e.g. 'rbd:'), and then use this prefix to find the driver.

usr.sbin/bhyve: Addressed comments from @ryan_freqlabs.com

Harbormaster completed remote builds in B29372: Diff 68312.Feb 14 2020, 11:15 AM

wjw_digiware.nl added inline comments.Feb 14 2020, 11:15 AM

usr.sbin/bhyve/block_if.c
139	@trociny I think it is beter actually to leave this up to the backends themselfs, this creates more flexibility. And would for example allow for multiple `/........` backends. Where if the basic local backend refuses, there could be other backends working on this `optstr` that can handle the request.
171	I have not fixed this in files I have not touched.

trociny added inline comments.Feb 14 2020, 2:52 PM

usr.sbin/bhyve/block_if.c
139	I can hardly imagine how backends are going to detect if its "their" device, if only they would not rely on some prefix name. And if they would why not to do this in generic way instead of implementing this in every driver? So, bhyve would look for provided disk name, and if it had "foo:" prefix, it would assume that the driver is "foo", and would try to open the device using this driver (if it had been successfully loaded.

freqlabs added inline comments.Feb 14 2020, 2:59 PM

usr.sbin/bhyve/block_if.c
139	I agree with `scheme:` suggest, ie `file://`, `rbd://`, `nbd://`, etc. with a fallback to the file backend if no scheme is given. Each backend can register its scheme name and we can directly pick the right one with no ambiguity.

freqlabs added inline comments.Feb 14 2020, 3:02 PM

usr.sbin/bhyve/block_if.c
139	Oops, I meant to remove `//` from the rest of those before I submit. Anything after the `:` would be passed to the backend.

wjw_digiware.nl added inline comments.Feb 14 2020, 3:25 PM

usr.sbin/bhyve/block_if.c
139	Right, and perhaps keep the `/......` for legacy defaulting to local store.
139	@trociny I guess it would be possible to add a scheme descriptor to the backend struct, and only go with that. But then still you'd have to call `bb_open()` to see it the `optstr` is a valid string and attach the device.

usr.sbin/bhyve: Add bb_sche to a backend to specify which backend to use

Harbormaster completed remote builds in B29378: Diff 68323.Feb 14 2020, 4:12 PM

usr.sbin/bhyve: Added possible scheme to the man page

Harbormaster completed remote builds in B29380: Diff 68325.Feb 14 2020, 4:28 PM

trociny added inline comments.Feb 14 2020, 7:40 PM

usr.sbin/bhyve/block_if.c
117	Note, bb_open may return NULL here and you will be dereferencing NULL
137	What is a point of passing the scheme prefix to the driver? The first step that it would need to do is to trim it. Why not instead always pass to the bb_open optstr without scheme. It is easy to trim it from the optstr that has it -- just increase optstr by the prefix length.
usr.sbin/bhyve/block_if.h
143	nit: I would call it `bb_name` and store the backend name without ending ':' ("file", "rbd").

wjw_digiware.nl added inline comments.Feb 14 2020, 8:26 PM

usr.sbin/bhyve/block_if.c
117	good call
137	Yeah, that is the other way around ..... Makes it easier here, but a bit more code in the `FOR loop`
usr.sbin/bhyve/block_if.h
143	Sure, can do that makes the code only more convoluted. what if `optstr` contains `/file/`, so it will require checking if `strstr` returns start of `optstr` AND that `bb_name != ""` And then we need still to go out and find the ":" But I'll give it a shot.

trociny added inline comments.Feb 14 2020, 8:52 PM

usr.sbin/bhyve/block_if.c
137	Otherwise you would have "a bit mode code" in every driver to parse the prefix.
usr.sbin/bhyve/block_if.h
143	I am not quite sure I follow. I suggest not to store ':' in the backend name, but sure the ':' should be present in `optstr` to separate scheme (backend) name from the rests of params. When parsing `optstr` for the backend name you should look for the first occurrence of ':' character and use the part of the string before ':' as a backend name if all characters there are alphanumeric. And the part of the string after ':' is passed to the backend open function (if you found the backend with such a name).

usr.sbin/bhyve: Update the backend selection code

Harbormaster completed remote builds in B29388: Diff 68338.Feb 14 2020, 9:24 PM

usr.sbin/bhyve: Make the legacy case operational

Harbormaster completed remote builds in B29389: Diff 68339.Feb 14 2020, 10:10 PM

@trociny
BTW: I'm running my tests like

bhyve-rbd -H -P -A -c 4 -m 2G -l com1,/dev/nmdm1213A -s 0,hostbridge -s 1,lpc -s 2,virtio-net,tap1213 -s 3,ahci-hd,file:/dev/ggate0 -s 4,ahci-hd,/dev/ggate1 FBSD1213

So we are already running of Ceph disks, be it in an odd way.

freqlabs added inline comments.Feb 23 2020, 3:15 AM

usr.sbin/bhyve/block_if.c
75	If you null out the ':' before moving the pointer forward then you can use strcmp below to ensure we get an exact match for the backend name. *optrest = '\0'; optrest++;
79–80	This comment is outdated.
101–102	We can `break` if open fails. There should be only one backend with the given name.

usr.bin/bhyve: implement code improvements

Harbormaster completed remote builds in B29602: Diff 68804.Feb 25 2020, 3:02 PM

I guess I addresses a suggestions thusfar.

This is getting really close to the finish line.

usr.sbin/bhyve/block_if.h
182	`void *` is more appropriate for private data.
usr.sbin/bhyve/block_local.c
811	These names seem backwards to me. What do you think about this instead? .bb_name = "blk-local", .bb_scheme = "file", Then name would be the thread name and scheme would be the scheme name for matching the backend.

This revision now requires changes to proceed.Feb 25 2020, 3:25 PM

usr.sbin/bhyve: change opague pointer, and rewrote to bb_threadinfo

Harbormaster completed remote builds in B29606: Diff 68808.Feb 25 2020, 3:46 PM

wjw_digiware.nl marked an inline comment as done.Feb 25 2020, 3:51 PM

wjw_digiware.nl added inline comments.

usr.sbin/bhyve/block_local.c
811	This you will have to fight with @trociny, since he did not like the name `scheme`. Note that ATM, `bb_threadinfo` is not used, since we only use what the parent inserted. So it is things like `"%d:%d", pi->pi_slot, pi->pi_func that set the threadname.

trociny added inline comments.Feb 25 2020, 4:18 PM

usr.sbin/bhyve/block_local.c
811	My point actually was why have both bb_name\|prefix and bb_scheme? Why don't just have one param: bb_name, so it is both the thread name and scheme name, whatever. It looks for me perfectly well if one have to specify a device like 'blk-local:/path/to/dev_or_file'

freqlabs accepted this revision.Feb 26 2020, 9:08 AM

freqlabs added inline comments.

usr.sbin/bhyve/block_local.c
811	I had not noticed that bb_threadinfo isn't used by the blockif abstraction. That being the case, just get rid of it. If a backend needs to store some thread info for its own threads, it can keep it in the private data. With that out of the way I don't mind much whether the remaining member is called bb_name or bb_scheme.