Page MenuHomeFreeBSD

rb_tree: augmentation shortcut
ClosedPublic

Authored by dougm on Sep 9 2022, 4:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Sep 11, 4:02 PM
Unknown Object (File)
Wed, Sep 11, 2:49 PM
Unknown Object (File)
Sun, Sep 8, 6:45 PM
Unknown Object (File)
Sun, Sep 8, 5:04 PM
Unknown Object (File)
Sat, Sep 7, 9:26 PM
Unknown Object (File)
Sat, Sep 7, 3:33 PM
Unknown Object (File)
Mon, Sep 2, 5:12 AM
Unknown Object (File)
Sun, Sep 1, 2:07 PM
Subscribers

Details

Summary

RB-tree augmentation maintains data in each node of the tree that represents the product of some associative operator applied to all the nodes of the subtree rooted at that node. If a node in the tree changes, augmentation data for the node is updated for that node and all nodes on the path from that node to the tree root. However, sometimes, augmenting a node changes no data in that node, particularly if the associated operation is something involving 'max' or 'min'. If augmentation changes nothing in a node, then the work of walking to the tree root from that point is pointless, because augmentation will change nothing in those nodes either. This change makes it possible to avoid that wasted work.

Define RB_AUGMENT_CHECK as a macro much like RB_AUGMENT, but which returns a value 'true' when augmentation changes the augmentation data of a node, and false otherwise. Change code that unconditionally walks and augments to the top of tree to code that stops once an augmentation has no effect. In the case of rebalancing the tree after insertion or deletion, where previously a node rotated into the path was inevitably augmented on the march to the tree root, now check to see if it needs augmentation because the march to the tree root stopped before reaching it.

Change the augmentation function in iommu_gas.c so that it returns true/false to indicate whether the augmentation had any effect.

Test Plan

My test code

includes copies of the old and new augmentation code from iommu_gas.c.

I ran three sets of tests. All tests start by initializing 64k nodes to be moved into and out of a tree repeatedly.
The plain test does as many insertions as deletions. (-DPLAIN -DTEST_AUGMENTATION -O2)
The ins test does only insertions, and when the tree is full nulls the root pointer and repeats. (-DINS_ONLY -DTEST_AUGMENTATION -O2)
The rem test does only removals, and then the tree is empty, memcpys the tree back into place and repeats. (-DREM_ONLY -DTEST_AUGMENTATION -O2)

I counted the number of calls to the augmentation function for each test, with and without the changes to avoid extra augmentation.

    plain      ins        rem
old 4007758921 4242664268 3252700484
new 2567983743 1485416036 2234118559

I timed 64 instances of each test. The results show considerable speedup. Here are the results:

x lip3.plain.old.res
+ lip3.plain.new.res
+------------------------------------------------------------------------------+
|     +                                                                        |
|     +                                                                        |
|     +                                                                        |
|    ++                                                               x        |
|    ++                                                               x        |
|    ++     +                                                         xx     x |
|    +++    +                                                         xx     x |
|    +++++  +                                                         xx     x |
|    +++++  +                                                         xx xx  x |
|    +++++  +                                                         xx xx  x |
|    +++++  +                                                        xxxxxx  x |
|+   +++++  +                                                    x x xxxxxx  xx|
|+   ++++++++                                                    x xxxxxxxxx xx|
|++  ++++++++                                                    xxxxxxxxxxxxxx|
|    |_A__|                                                         |__MA__|   |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  64     74.147468     76.479667     75.300264     75.387626     0.6243859
+  64     62.698308     64.726801     63.696598     63.827538    0.49876364
Difference at 95.0% confidence
        -11.5601 +/- 0.195789
        -15.3342% +/- 0.25971%
        (Student's t, pooled s = 0.565077)
x lip3.ins.old.res
+ lip3.ins.new.res
+------------------------------------------------------------------------------+
|       +                                  xx                                  |
|       +                                  xx                                  |
|       +                                  xx                                  |
|       +                                  xx                                  |
|       +                                  xx                                  |
|      ++ +                                xx       x                          |
|      ++++                                xx       x                          |
|      +++++                              xxx       x                          |
|      ++++++                             xxx      xx                          |
|++   +++++++  ++                         xxxx  x xxxx                         |
|++   +++++++  ++++                      xxxxxxxx xxxx                         |
|+++++++++++++ ++++                  x xxxxxxxxxx xxxxx                       x|
|    |___A___|                           |__M_A_____|                          |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  64     45.513771     48.967714      46.13122     46.317636    0.48561665
+  64     42.506508     43.940611     43.155563     43.193769    0.35055281
Difference at 95.0% confidence
        -3.12387 +/- 0.146737
        -6.74444% +/- 0.316805%
        (Student's t, pooled s = 0.423504)
x lip3.rem.old.res
+ lip3.rem.new.res
+------------------------------------------------------------------------------+
|  +                                                                           |
|  +                                                                           |
|  +                                                                           |
|  +                                                                           |
|  +                                                                           |
|  +                                                                           |
|  +                                                                      x    |
|  ++                                                                    xx    |
|  ++                                                                    xx    |
|  +++                                                                   xx    |
|  +++                                                                   xx    |
|  +++                                                                   xx    |
| ++++                                                                x  xx    |
| ++++                                                                xx xx    |
| ++++                                                                xxxxx    |
|+++++                                                                xxxxx    |
|+++++                                                                xxxxx    |
|+++++                                                                xxxxx    |
|+++++                                                                xxxxxx  x|
|++++++                                                               xxxxxxxxx|
|++++++                                                            x  xxxxxxxxx|
| |A_|                                                                 |_A_|   |
+------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  64     50.236848     52.423294     51.415052     51.392083    0.43289807
+  64     36.856171     37.866601     37.326483     37.342637    0.25516036
Difference at 95.0% confidence
        -14.0494 +/- 0.123113
        -27.3378% +/- 0.239556%
        (Student's t, pooled s = 0.355322)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

dougm requested review of this revision.Sep 9 2022, 4:47 PM
dougm created this revision.
dougm edited the test plan for this revision. (Show Details)

Change iommu_gas_augment_entry, because a fact that it was depending on is only a fact when the node has not been rotated into the path.

Make RB_RANK work when augmentation is defined by RB_AUGMENT, not RB_AUGMENT_CHECK.

Resolve conflicts with recent changes to tree.h.

dougm added a reviewer: kib.

Reverse the order of insertion of the boundary entries of the tree. The 'end' entry, with its empty range, won't be considered 'changed' by augmentation, so won't lead to an augmentation update in 'begin', so insert it first.

If empty ranges can be routinely inserted, then the conditions for defining change by augmentation would need to change.

Actually update the patch.

kib added inline comments.
share/man/man3/tree.3
631
This revision is now accepted and ready to land.Sep 17 2022, 2:30 PM

Man page fix will go into the committed change.

This revision now requires review to proceed.Sep 21 2022, 3:23 AM
share/man/man3/tree.3
630
sys/sys/tree.h
359

Why add a blank line here?

sys/sys/tree.h
372

Should we have a safety belt to ensure that both RB_AUGMENT_CHECK(x) and RB_AUGMENT(x) are not defined? Should the man page explicitly forbid both being defined?

dougm marked 2 inline comments as done.

Accept suggestions. Add comment explaining why the 'end' entry must be the first one inserted into the new tree.

sys/sys/tree.h
359

No good reason.

372

I don't mind adding something, but if someone defines correct versions of both, the code will correctly use one and ignore the other. So no harm will come to them.

sys/sys/tree.h
377

I would put the ; on the next line, so that it is more obvious that the loop body is intentionally empty.

Apply recommended style change.

This revision is now accepted and ready to land.Sep 21 2022, 4:18 AM
This revision was automatically updated to reflect the committed changes.