Page MenuHomeFreeBSD

linuxkpi: Implement extensible arrays API using existing radix tree.
ClosedPublic

Authored by hselasky on Jun 2 2020, 12:46 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Dec 23, 4:43 AM
Unknown Object (File)
Sun, Dec 22, 3:58 AM
Unknown Object (File)
Thu, Dec 19, 5:53 AM
Unknown Object (File)
Wed, Dec 18, 3:38 PM
Unknown Object (File)
Sat, Dec 14, 4:52 PM
Unknown Object (File)
Fri, Dec 13, 1:41 AM
Unknown Object (File)
Sun, Dec 8, 8:51 AM
Unknown Object (File)
Sun, Dec 8, 4:34 AM

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Fix inverted xa_empty() return value.

So what/where is the API specification ?
What are the synchronization guarantees ?
Are there any peculiarities in the implementation worth mention ?

What are the synchronization guarantees?

All writing is done under a mutex.
All reading is done under RCU.

That's it.

What are the synchronization guarantees?

All writing is done under a mutex.
All reading is done under RCU.

That's it.

I mean, what is the API user contract for locking.

Can you use FreeBSD primitives instead of Linux wrappers ? I mean atomics/mutexes/epochs.

sys/compat/linuxkpi/common/src/linux_radix.c
306 ↗(On Diff #72564)

In other place you also free root node if it has zero children count.

331 ↗(On Diff #72564)

Style: space before '('.

sys/compat/linuxkpi/common/src/linux_xarray.c
60 ↗(On Diff #72564)

This increment lacks an assert that the operation did not overflow.

70 ↗(On Diff #72564)

The check for underflow is needed, even if racy.

76 ↗(On Diff #72564)

Style, no initialization in definition.

Why this recursion is possible (explanation should be a comment) ? Is it because user could lock xa and then call the API ?

That said, am I right that API distinguish between unlocked and locked entry points ? mtx_owned() generally should not be used, esp. when the code easily knows the state by contract.

108 ↗(On Diff #72564)

What state is kept 'AS-IS' ?

I see, at least, that all readers are forced to use mutex while we sleep, is it needed ?

hselasky added inline comments.
sys/compat/linuxkpi/common/src/linux_radix.c
306 ↗(On Diff #72564)

I will have to look at this.

sys/compat/linuxkpi/common/src/linux_xarray.c
76 ↗(On Diff #72564)

The API is not clear about locked and unlocked and I think we need to handle both, looking at the driver code we need to support.

The problem is that tree manipulations are allowed if flags do not allow sleeping. I can assert that.

108 ↗(On Diff #72564)

I don't think so. Let me analyze.

hselasky marked 2 inline comments as done.

Address comments from @kib

sys/compat/linuxkpi/common/src/linux_radix.c
331 ↗(On Diff #72564)

Fixed.

Add some more comments explaining why memory cleanup is not needed when inserting radix tree nodes and a failure happens.

sys/compat/linuxkpi/common/src/linux_radix.c
306 ↗(On Diff #72564)

I'm adding a comment here to explain the logic and why it is not leaking any memory this way.

sys/compat/linuxkpi/common/include/linux/xarray.h
48 ↗(On Diff #75293)

Can you please write it in more proper way, 0xffffffff ? If I guess the meaning of the abbreviation 32b correctly.

sys/compat/linuxkpi/common/src/linux_radix.c
229 ↗(On Diff #75293)

I do not think exclamation mark is warranted in comments.

292 ↗(On Diff #75293)

We usually provide full sentences even in single-line comments.
I.e. start it with capital letter, end with dot.

sys/compat/linuxkpi/common/src/linux_xarray.c
47 ↗(On Diff #75293)

You ignored my notes from previous pass.

hselasky added inline comments.
sys/compat/linuxkpi/common/src/linux_radix.c
292 ↗(On Diff #75293)

Will update this comment. Trying to follow the style in this file.

sys/compat/linuxkpi/common/src/linux_xarray.c
47 ↗(On Diff #75293)

Can you repeat? I've added the assert you requested.

hselasky marked 2 inline comments as done.

Address comments from kib@

Correctly place misplaced underscore.

sys/compat/linuxkpi/common/src/linux_xarray.c
47 ↗(On Diff #75293)

I do not want to retype all my comments.

One of the biggest one was the request to use FreeBSD primitives directly instead of layering LinuxKPI over LinuxKPI.

hselasky edited the summary of this revision. (Show Details)

Use FreeBSD native API's when possible as suggested by @kib

Unfortunately we still have to use the LinuxKPI's RCU, because the EPOCH requires an epoch-tracker, which is not part of the xarray API.

Maybe this can be added to the task_struct structure ...

Use FreeBSD native API's when possible as suggested by @kib

Unfortunately we still have to use the LinuxKPI's RCU, because the EPOCH requires an epoch-tracker, which is not part of the xarray API.

Maybe this can be added to the task_struct structure ...

I suggest we use RCU, and then at some point convert RCU in LKPI to use EPOCH. Like said we need an epoch tracker in the LKPI thread structure, and it needs to support recursion.

Don't optimise the xarray implementation with RCU yet, because the radix tree needs to support it first!
Just use simple mutexes for now to protect read and write operations.

No need to include atomic.h .
Fix minor whitespace issue.

sys/compat/linuxkpi/common/src/linux_xarray.c
91 ↗(On Diff #75773)

If mask must be 2**n-1, this should be asserted. Otherwise, code accept any value for the mask.

Are there any required relations between *pindex and mask ?

113 ↗(On Diff #75773)

I suspect this could be quite slow for large contig runs. Can you add an interface for radix tree to return next free index ?

hselasky added inline comments.
sys/compat/linuxkpi/common/src/linux_xarray.c
91 ↗(On Diff #75773)

Let me put an assert there to make it more clear.

hselasky added inline comments.
sys/compat/linuxkpi/common/src/linux_xarray.c
113 ↗(On Diff #75773)

I'm not sure how much we can save and if you want to insert multiple items, then use __xa_alloc_cyclic() , because it tracks the last inserted index, to avoid lookup from the start all the time.

hselasky marked an inline comment as done.

Add more asserts.

sys/compat/linuxkpi/common/src/linux_xarray.c
113 ↗(On Diff #75773)

@kib: Can you explain a bit more how the radix tree implementation could find the first free slot faster? I mean, if you want to find the first free slot you need to iterate all the leaf nodes anyway. And the lookup per node is logarithmic, so I don't think we are wasting much time there!

sys/compat/linuxkpi/common/src/linux_xarray.c
113 ↗(On Diff #75773)

I mean, that for large filled contig runs already existing in the radix tree, calling radix_tree_insert() for each allocated slot adds significant overhead for the implementation. Radix tree code loses its context on return.

You could try to use already known tree pointers to iterate further and return next free slot in case of radix_tree_insert() failure.

Fix build issue by including <sys/lock.h>

hselasky added inline comments.
sys/compat/linuxkpi/common/src/linux_xarray.c
113 ↗(On Diff #75773)

I see your point. However can we leave that for later?

There are two action items for improving the radix tree:

  1. Optimise large contig runs
  2. Add support for RCU to the radix tree, so that the xarray load can be lockless under RCU.

Action item #2 is a bigger showstopper than #1.

I'd rather see we can get going with xarrays, than having everything optimised to the teeth :-)

Note that things like epochs/RCU provide scalability while giving some hit on single-threaded performance. And the issue of large runs, if the use creates them, also affects single-threaded performance. Anyway, it is not a blocker.

See one small note inline.

sys/compat/linuxkpi/common/include/linux/xarray.h
47 ↗(On Diff #76032)

U suffix is not needed. The value is automatically promoted to unsigned if it cannot fit into int.

This revision is now accepted and ready to land.Aug 20 2020, 8:10 PM

I'll remove the U suffix before committing. Thank you!