In pctrie_remove(), if the removal of an item leaves an internal node with one child, then there are two calls to pctrie_node_store(). Only the second of these two stores needs PCTRIE_LOCKED synchronization. Eliminate the pointless synchronization for the first pctrie_node_store() call in that case.
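The idea can be illustrated outside the kernel with C11 atomics. The following is only a sketch under stated assumptions, not the code in sys/kern/subr_pctrie.c: `sketch_node`, `sketch_node_store()`, `sketch_remove()`, the `SKETCH_*` names and the 4-slot node width are all hypothetical, a release store stands in for the locked variant, and a relaxed store stands in for the unserialized one.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_WIDTH 4	/* hypothetical fan-out, chosen only for brevity */

/* Hypothetical stand-ins for the access modes discussed above. */
enum sketch_access { SKETCH_UNSERIALIZED, SKETCH_LOCKED };

struct sketch_node {
	_Atomic(void *) child[SKETCH_WIDTH];
};

/* Store v into *slot with ordering chosen by the access mode. */
static void
sketch_node_store(_Atomic(void *) *slot, void *v, enum sketch_access access)
{
	if (access == SKETCH_LOCKED)
		atomic_store_explicit(slot, v, memory_order_release);
	else
		atomic_store_explicit(slot, v, memory_order_relaxed);
}

/*
 * Clear child[idx] of node.  If exactly one other child remains, splice that
 * child into *parent_slot and return node so the caller can free it later;
 * otherwise the node stays in place and NULL is returned.
 */
static struct sketch_node *
sketch_remove(_Atomic(void *) *parent_slot, struct sketch_node *node, int idx)
{
	void *c, *only = NULL;
	int i, remaining = 0;

	for (i = 0; i < SKETCH_WIDTH; i++) {
		if (i == idx)
			continue;
		c = atomic_load_explicit(&node->child[i], memory_order_relaxed);
		if (c != NULL) {
			remaining++;
			only = c;
		}
	}
	if (remaining != 1) {
		/* The node stays linked: this is the only store, so order it. */
		sketch_node_store(&node->child[idx], NULL, SKETCH_LOCKED);
		return (NULL);
	}
	/* First store: the node is about to be unlinked, so no ordering. */
	sketch_node_store(&node->child[idx], NULL, SKETCH_UNSERIALIZED);
	/* Second store: publishes the new shape, so it keeps the ordering. */
	sketch_node_store(parent_slot, only, SKETCH_LOCKED);
	return (node);
}
```

In this sketch the ordered store into the parent slot is the one that changes what a concurrent lookup can reach, which is why only that store keeps the stronger ordering in the collapse case.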
Details
- Repository: rG FreeBSD src repository
- Lint: Not Applicable
- Unit Tests: Not Applicable
Event Timeline
sys/kern/subr_pctrie.c | ||
---|---|---|
856 | Why use child in the three stores above instead of a literal PCTRIE_NULL? |
Results of performing 11 buildworld tests on modified and unmodified sources, using a patch from @alc to instrument cycle counts for several page-related functions. Performance improvements are evident in the opr (vm_object_page_remove) and kmem_unback cases. Some cases that don't do much freeing may have gotten a little worse, for some reason.
function | original #calls | original cycles/call | with changes #calls | with changes cycles/call |
---|---|---|---|---|
alloc_gpgs | 752499 | 3434.372098833354 | 752724 | 3418.422265000186 |
alloc_gpgs | 688754 | 3652.5813439921944 | 688525 | 3661.66810210232 |
alloc_gpgs | 688309 | 3975.5709775696673 | 688468 | 3707.9487325482087 |
alloc_gpgs | 688082 | 3766.6048610485377 | 687784 | 3901.0714352180335 |
alloc_gpgs | 688213 | 3780.41536268568 | 688081 | 4310.255092060383 |
alloc_gpgs | 687883 | 3854.9886492325004 | 687778 | 3895.6761702177155 |
alloc_gpgs | 687877 | 3877.161751301468 | 687822 | 4045.502458484899 |
alloc_gpgs | 688000 | 4263.730816860465 | 687897 | 3878.1561265712744 |
alloc_gpgs | 688003 | 3933.9803649111996 | 687820 | 3878.117705213573 |
alloc_gpgs | 687803 | 3923.037749181088 | 687918 | 3932.454445151893 |
alloc_gpgs | 688833 | 3929.0253660901844 | 687819 | 3943.970980737665 |
alloc_grab | 1500 | 439.9313333333333 | 1505 | 448.67109634551497 |
alloc_grab | 324 | 748.7932098765432 | 342 | 922.2953216374269 |
alloc_grab | 386 | 725.6217616580311 | 343 | 839.5655976676385 |
alloc_grab | 284 | 763.1901408450705 | 292 | 926.013698630137 |
alloc_grab | 283 | 722.4805653710247 | 276 | 868.1594202898551 |
alloc_grab | 282 | 723.0744680851063 | 252 | 832.2063492063492 |
alloc_grab | 252 | 775.531746031746 | 292 | 886.8595890410959 |
alloc_grab | 302 | 729.2185430463576 | 295 | 878.9694915254237 |
alloc_grab | 225 | 737.8622222222223 | 201 | 938.8059701492538 |
alloc_grab | 200 | 735.005 | 178 | 810.0505617977528 |
alloc_grab | 202 | 673.509900990099 | 214 | 750.0280373831775 |
alloc | 374707025 | 1645.4839279834691 | 374717526 | 1659.725050850704 |
alloc | 374694582 | 1673.5760878869607 | 374693567 | 1671.4806031003995 |
alloc | 374696972 | 1668.960936022723 | 374696078 | 1669.87785437135 |
alloc | 374701439 | 1657.7697043165078 | 374716487 | 1667.2020170118642 |
alloc | 374673512 | 1660.3353595036097 | 374682274 | 1675.2052507479978 |
alloc | 374711665 | 1650.3608927546998 | 374689507 | 1688.5364120351521 |
alloc | 374695683 | 1650.4322910066728 | 374696192 | 1676.5840374379893 |
alloc | 374692268 | 1648.5880521292208 | 374691317 | 1672.9629707218435 |
alloc | 374685421 | 1666.537060936246 | 374640447 | 1684.1120833544169 |
alloc | 374704084 | 1656.4897200906944 | 374684512 | 1677.768827378165 |
alloc | 374691565 | 1646.5597222771748 | 374699616 | 1675.9975831840725 |
opr | 1825877 | 42207.950979720976 | 1828949 | 40468.94419855338 |
opr | 2048002 | 37694.7010305654 | 2047483 | 36209.549937166754 |
opr | 2048947 | 38366.00136069894 | 2046351 | 35977.37083716332 |
opr | 2046274 | 38312.004955836805 | 2047446 | 36102.1195421027 |
opr | 2049431 | 37979.117602397935 | 2046667 | 36657.77648684422 |
opr | 2044824 | 37866.99186531457 | 2049548 | 35951.91515934245 |
opr | 2051731 | 38398.331697966256 | 2045735 | 36121.996644726714 |
opr | 2042941 | 38785.87913356284 | 2053400 | 36768.10080305834 |
opr | 2050752 | 38354.771603782414 | 2043281 | 36414.52181369082 |
opr | 2048303 | 38185.69551477491 | 2045747 | 36246.42377820914 |
opr | 2044478 | 38262.74138924459 | 2052872 | 36251.66563672747 |
collapse_scan | 63248 | 14125.669033012902 | 63315 | 18055.23855326542 |
collapse_scan | 63370 | 13855.8851822629 | 63325 | 14103.989830240822 |
collapse_scan | 63422 | 14027.450127715934 | 63308 | 14081.0043438428 |
collapse_scan | 63444 | 13980.526747998234 | 63278 | 14189.625715098455 |
collapse_scan | 63428 | 14873.081241722899 | 63429 | 14106.197937851772 |
collapse_scan | 63376 | 13980.024046957838 | 63324 | 14110.894573937212 |
collapse_scan | 63287 | 13968.580119139791 | 63486 | 14166.229735689758 |
collapse_scan | 63320 | 13956.721383449147 | 63375 | 14201.002840236686 |
collapse_scan | 63329 | 13998.82721975714 | 63327 | 14286.300219495633 |
collapse_scan | 63267 | 14124.018824979847 | 63374 | 14286.674771988513 |
collapse_scan | 63342 | 13993.0072305895 | 63310 | 15099.437008371506 |
osplit | 20480 | 3368.103857421875 | 20480 | 3388.83505859375 |
osplit | 20484 | 3405.1379613356767 | 20483 | 3408.3190450617585 |
osplit | 20482 | 3367.8761351430526 | 20480 | 3444.904150390625 |
osplit | 20486 | 3401.9500634579713 | 20483 | 3451.5582678318606 |
osplit | 20484 | 3416.00820152314 | 20481 | 3416.3748840388653 |
osplit | 20487 | 3421.4768877824963 | 20481 | 3423.384307406865 |
osplit | 20483 | 3402.981203925206 | 20484 | 3408.136399140793 |
osplit | 20480 | 3430.378759765625 | 20485 | 3477.10412496949 |
osplit | 20480 | 3419.13603515625 | 20483 | 3479.576966264707 |
osplit | 20484 | 3471.5063952353057 | 20484 | 3458.8027728959187 |
osplit | 20485 | 3426.6515499145717 | 20483 | 3534.532343894937 |
kmem_unback | 224 | 4489.553571428572 | 216 | 4101.518518518518 |
kmem_unback | 264 | 3644.6401515151515 | 262 | 4273.782442748092 |
kmem_unback | 270 | 4500.2962962962965 | 264 | 4178.337121212121 |
kmem_unback | 268 | 4442.899253731343 | 264 | 4116.810606060606 |
kmem_unback | 270 | 4382.444444444444 | 266 | 4112.146616541353 |
kmem_unback | 266 | 4593.424812030075 | 268 | 4191.9067164179105 |
kmem_unback | 270 | 4517.7 | 268 | 4415.563432835821 |
kmem_unback | 266 | 4524.988721804511 | 266 | 4282.68045112782 |
kmem_unback | 270 | 4469.188888888889 | 266 | 4322.7406015037595 |
kmem_unback | 266 | 4467.402255639097 | 258 | 3410.1666666666665 |
kmem_unback | 270 | 4581.559259259259 | 264 | 4250.234848484848 |
kmem_back | 906 | 3702.6545253863133 | 964 | 3694.1659751037346 |
kmem_back | 271 | 4047.4723247232473 | 269 | 3214.5985130111526 |
kmem_back | 275 | 4814.036363636364 | 261 | 3805.0459770114944 |
kmem_back | 250 | 3971.58 | 269 | 3593.263940520446 |
kmem_back | 295 | 4289.742372881356 | 276 | 4160.355072463768 |
kmem_back | 283 | 3947.3639575971733 | 263 | 4263.019011406844 |
kmem_back | 267 | 3514.5842696629215 | 269 | 4185.126394052045 |
kmem_back | 267 | 4723.943820224719 | 243 | 4343.3127572016465 |
kmem_back | 254 | 3352.8700787401576 | 288 | 3789.4166666666665 |
kmem_back | 255 | 4220.03137254902 | 262 | 4523.885496183206 |
kmem_back | 287 | 3131.076655052265 | 265 | 4442.652830188679 |
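One simple way to summarize a function's rows is to compare the mean cycles/call across the 11 runs. The sketch below does this for the opr rows only, with the values copied from the results above. The helper names `mean_of()` and `percent_change()` are hypothetical, and the mean is unweighted across runs (it ignores the slightly different call counts), which is a simplification rather than the method used in the review.

```c
#include <stddef.h>

/* opr cycles/call for the 11 runs, copied from the results above. */
static const double opr_original[11] = {
	42207.950979720976, 37694.7010305654, 38366.00136069894,
	38312.004955836805, 37979.117602397935, 37866.99186531457,
	38398.331697966256, 38785.87913356284, 38354.771603782414,
	38185.69551477491, 38262.74138924459
};
static const double opr_changed[11] = {
	40468.94419855338, 36209.549937166754, 35977.37083716332,
	36102.1195421027, 36657.77648684422, 35951.91515934245,
	36121.996644726714, 36768.10080305834, 36414.52181369082,
	36246.42377820914, 36251.66563672747
};

/* Unweighted arithmetic mean of n samples. */
static double
mean_of(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return (sum / (double)n);
}

/* Percent change of the mean of "after" relative to the mean of "before". */
static double
percent_change(const double *before, const double *after, size_t n)
{
	double b = mean_of(before, n);

	return (100.0 * (mean_of(after, n) - b) / b);
}
```

Calling `percent_change(opr_original, opr_changed, 11)` yields a negative value when the patched runs are cheaper on average. The same helpers can be pointed at any other function's rows.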
sys/kern/subr_pctrie.c | ||
---|---|---|
856 | To reduce the size of the difference between this and the next iteration of D46895, where child will be a pointer parameter and not a local variable. But I don't have to do that. I'll use PCTRIE_NULL here. |
Was this on amd64? The difference between the locked and unserialized variants of pctrie_node_store() is pretty minimal there, since x86 memory ordering constraints mean that plain stores already provide the desired semantics, ignoring compiler reordering of memory operations. So, it's not too surprising that the results are a bit inconclusive. On arm64 I'd hope to see a more consistent improvement.
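The architectural point can be seen with a small C11 example. This is a userspace sketch, not kernel code: `slot`, `publish_release()`, `publish_relaxed()` and `read_acquire()` are hypothetical names, and the release and relaxed stores are assumed stand-ins for the locked and unserialized variants. With common compilers such as GCC and Clang, both stores typically become a plain `mov` on x86-64, whereas on arm64 the release store typically becomes a store-release instruction (`stlr`) and the relaxed store a plain `str`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* A shared pointer slot, standing in for a child pointer in a trie node. */
static _Atomic(int *) slot;

/* Ordered publish: earlier writes become visible before the pointer does. */
static void
publish_release(int *p)
{
	atomic_store_explicit(&slot, p, memory_order_release);
}

/* Unordered store: atomic, but with no ordering against other accesses. */
static void
publish_relaxed(int *p)
{
	atomic_store_explicit(&slot, p, memory_order_relaxed);
}

/* Reader side that pairs with publish_release(). */
static int *
read_acquire(void)
{
	return (atomic_load_explicit(&slot, memory_order_acquire));
}
```

Comparing the generated assembly for the two store functions on each architecture is one way to check how much the weaker store can save on a given target.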