Page MenuHomeFreeBSD

libuvmem: usermode port of vmem(9)
Needs ReviewPublic

Authored by kib on Nov 14 2020, 10:13 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Oct 16, 9:24 AM
Unknown Object (File)
Sun, Oct 12, 10:50 PM
Unknown Object (File)
Sun, Oct 12, 3:27 PM
Unknown Object (File)
Fri, Oct 10, 10:54 AM
Unknown Object (File)
Wed, Oct 8, 7:18 AM
Unknown Object (File)
Mon, Oct 6, 4:03 AM
Unknown Object (File)
Fri, Oct 3, 12:49 AM
Unknown Object (File)
Fri, Oct 3, 12:24 AM

Details

Summary

The quantum cache is disabled, there is no uma.

Intent is to use this for resource allocation in bhyve(8), for start. Addition of -luvmem to bhyve linking was done to test changes to share/mk.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kib requested review of this revision.Nov 14 2020, 10:13 PM
grehan added a subscriber: grehan.

No objection from bhyve's PoV.

Seems ok to me, just nitpicking.

sys/kern/subr_vmem.c
608

Commit separately?

812

I would delete the rehash comment and merge the two blocks. All of the functions below can arguably be called "vmem internal functions".

818

This can be a separate commit.

848–849

This can be a separate commit.

1363

Why?

1441

I really dislike conditionally defined top-level local variables, so I'd prefer to move the definition into the if block below. I don't insist on it though.

1574

You could eliminate some inline ifdefs with #define panic(...) assert(0).

1582–1583

Same comment as above wrt local variables.

1598

This can be a separate commit.

1694

This block should probably be formally protected with ifdef _KERNEL as well.

This revision is now accepted and ready to land.Nov 16 2020, 4:41 PM
sys/kern/subr_vmem.c
1441

I agree in general, and if this were some special case (e.g. INVARIANTS) instead of the usual case I'd argue strongly for markj's take.

kib marked 7 inline comments as done.Nov 17 2020, 1:32 AM
kib added inline comments.
sys/kern/subr_vmem.c
1363

To not add __unused declaration, I think it is logically wrong.

1574

There were only two of them so I did not bothered.

This revision was automatically updated to reflect the committed changes.
kib marked 2 inline comments as done.
kib updated this revision to Diff 79634.

Rebase after commit of warning fixes.
Handle Mark notes.

markj added inline comments.
sys/contrib/openzfs/include/sys/zfs_context.h
53

Is this just a reminder to upstream the change? Or is there another problem?

sys/kern/subr_vmem.c
1363

I think (void)qcache_max; is more common for this scenario.

This revision is now accepted and ready to land.Nov 17 2020, 2:38 PM

Add M_*FIT bits to sys/vmem.h for userspace.

This revision now requires review to proceed.Nov 19 2020, 5:37 PM

Hi, is there a concrete reason why this revision never landed or has it just slipped through the cracks?

I would really like to use libuvmem for one of my ongoing bhyve projects and would like to see this land, let me know if there a way I could help with that.

Hi, is there a concrete reason why this revision never landed or has it just slipped through the cracks?

I would really like to use libuvmem for one of my ongoing bhyve projects and would like to see this land, let me know if there a way I could help with that.

The intent of the libuvmem port was to use vmem allocator for bhyve to handle physical address space. That stalled, and I do not see a reason to add unused library to base.
If you need it, I am more than happy to push the commit.

Meantime I rebased the branch.

Rebase. Use FreeBSD 16.0 symbol version for the new lib.

Remove undefined symbols from the version script. The library should be buildable, at least.

Use standard way to reference libc src dir.
Silence compiler warning in _bt_fill() about flags.

sys/contrib/openzfs/include/sys/zfs_context.h
53

AFAIR it did not work, there was some conflict because solaris shims also provided sys/vmem.h.

sys/kern/subr_vmem.c
1425

Maybe add macros for these list locks, like we have for other locks?

kib marked an inline comment as done.

VMEM_LIST_LOCK()

kib marked 6 inline comments as done.Tue, Oct 14, 5:23 PM
In D27220#1212532, @kib wrote:

Hi, is there a concrete reason why this revision never landed or has it just slipped through the cracks?

I would really like to use libuvmem for one of my ongoing bhyve projects and would like to see this land, let me know if there a way I could help with that.

The intent of the libuvmem port was to use vmem allocator for bhyve to handle physical address space. That stalled, and I do not see a reason to add unused library to base.

Is there a revision with the stalled patches somewhere? I'd be interested in taking a look.

If you need it, I am more than happy to push the commit.

That would be amazing, thank you!
It would really help with managing PCI BAR allocations for my ongoing device hotplugging work.

In D27220#1212532, @kib wrote:

Hi, is there a concrete reason why this revision never landed or has it just slipped through the cracks?

I would really like to use libuvmem for one of my ongoing bhyve projects and would like to see this land, let me know if there a way I could help with that.

The intent of the libuvmem port was to use vmem allocator for bhyve to handle physical address space. That stalled, and I do not see a reason to add unused library to base.

Is there a revision with the stalled patches somewhere? I'd be interested in taking a look.

I suspect none survived, since I only found this libuvmem diff itself on that branch.

If you need it, I am more than happy to push the commit.

That would be amazing, thank you!
It would really help with managing PCI BAR allocations for my ongoing device hotplugging work.

Yes this is exactly the place where libuvmem should fit perfectly.

Looks ok to me. Perhaps this should be a private library instead?

lib/libuvmem/Makefile
2

Extra newline.

8
This revision is now accepted and ready to land.Wed, Oct 15, 1:18 PM
kib marked 2 inline comments as done.Wed, Oct 15, 1:45 PM

Looks ok to me. Perhaps this should be a private library instead?

IMO it is generally useful API. Also, I do provide symbol versioning, which does not make sense for a private library: the intent is to keep stable interfaces.

This revision now requires review to proceed.Wed, Oct 15, 1:45 PM
In D27220#1213306, @kib wrote:

Looks ok to me. Perhaps this should be a private library instead?

IMO it is generally useful API. Also, I do provide symbol versioning, which does not make sense for a private library: the intent is to keep stable interfaces.

Well, now the kernel's internal interface must be kept stable as well. I don't object, but I'm worried it'll be easy for someone modifying vmem to forget about this.

If the library is public, it should get a man page IMO, even if it just points at vmem(9).

In D27220#1213306, @kib wrote:

Looks ok to me. Perhaps this should be a private library instead?

IMO it is generally useful API. Also, I do provide symbol versioning, which does not make sense for a private library: the intent is to keep stable interfaces.

Well, now the kernel's internal interface must be kept stable as well. I don't object, but I'm worried it'll be easy for someone modifying vmem to forget about this.

Yes, I noted the same when parts of msdosfs were used in makefs (AFAIR). But there it is much simpler interface, and if the interface for kernel or userspace starts drifting, the code can be easily forked to the residual part.

In D27220#1213582, @kib wrote:
In D27220#1213306, @kib wrote:

Looks ok to me. Perhaps this should be a private library instead?

IMO it is generally useful API. Also, I do provide symbol versioning, which does not make sense for a private library: the intent is to keep stable interfaces.

Well, now the kernel's internal interface must be kept stable as well. I don't object, but I'm worried it'll be easy for someone modifying vmem to forget about this.

Yes, I noted the same when parts of msdosfs were used in makefs (AFAIR). But there it is much simpler interface, and if the interface for kernel or userspace starts drifting, the code can be easily forked to the residual part.

My point is that no automation (e.g., tests) will catch this drift until after it is a problem for some hypothetical users. If makefs -t msdos is broken somehow, we will notice fairly quickly.

If Bojan's work lands in bhyve, maybe that will be enough.

lib/libuvmem/libuvmem.3
48
64

You might note that the allocator is thread-safe.

This revision is now accepted and ready to land.Thu, Oct 16, 1:27 PM

One minor thing I've noticed is that we don't export M_NOWAIT.
I think it would be very useful to have a way of controlling whether we wait or exit on failure, but I'm not sure if accomplishing that is as straightforward as defining M_NOWAIT in sys/vmem.h.

sys/kern/subr_vmem.c
1354–1355

I hit an error caused by this pretty early on.

Because the assertion above enforces that the quantum should always be 0, the resulting quantum mask will be set to -1 (all F's) which will cause vmem_roundup_size to always return 0.
This locks vmem_xalloc into either an infinite loop or causes it to block indefinitely.

kib marked 2 inline comments as done.Fri, Oct 17, 12:01 AM

One minor thing I've noticed is that we don't export M_NOWAIT.
I think it would be very useful to have a way of controlling whether we wait or exit on failure, but I'm not sure if accomplishing that is as straightforward as defining M_NOWAIT in sys/vmem.h.

It should be, at least lets try it that way. I added M_NOWAIT into !_KERNEL block of sys/vmem.h. Also I inspected subr_vmem.c and M_WAITOK seems to be only used by the kernel blocks.

sys/kern/subr_vmem.c
1354–1355

Thanks. I handled it differently, forcing quantum = 1 in the initial ifdef block in vmem_init().

Set quantum to 1.
Allow M_NOWAIT
List flags for vmem_xalloc() in man page.
Mention thread-safety in man page.
Remove unused VMEM_TRYLOCK()

This revision now requires review to proceed.Fri, Oct 17, 12:02 AM