pctrie: unlock node store in remove
ClosedPublic

Authored by dougm on Feb 17 2025, 5:59 PM.

Details

Summary

In pctrie_remove(), if removing an item leaves an internal node with only one child, there are two calls to pctrie_node_store(). Only the second of these stores needs PCTRIE_LOCKED synchronization. Eliminate the pointless synchronization on the first pctrie_node_store() call in that case.
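
As a rough illustration of the two stores described above, here is a user-space sketch using C11 atomics. It is not the code from sys/kern/subr_pctrie.c: the `sketch_*` names, the 4-way node, and the mapping of the locked mode to a release store and the unserialized mode to a relaxed store are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_WIDTH 4	/* hypothetical fan-out, chosen for brevity */

/* Hypothetical stand-ins for the kernel's access modes. */
enum sketch_access { SKETCH_UNSERIALIZED, SKETCH_LOCKED };

struct sketch_node {
	_Atomic(void *) child[SKETCH_WIDTH];
};

/* Store v into slot; only SKETCH_LOCKED orders earlier writes before it. */
static void
sketch_node_store(_Atomic(void *) *slot, void *v, enum sketch_access access)
{
	atomic_store_explicit(slot, v, access == SKETCH_LOCKED ?
	    memory_order_release : memory_order_relaxed);
}

/*
 * Remove the entry in node->child[idx].  If exactly one other child is
 * left, splice that child into *parent_slot so that node drops out of
 * the trie.  Returns the node that became unreachable, or NULL.
 */
static struct sketch_node *
sketch_remove(_Atomic(void *) *parent_slot, struct sketch_node *node, int idx)
{
	void *c, *only = NULL;
	int i, remaining = 0;

	assert(idx >= 0 && idx < SKETCH_WIDTH);
	for (i = 0; i < SKETCH_WIDTH; i++) {
		if (i == idx)
			continue;
		c = atomic_load_explicit(&node->child[i], memory_order_relaxed);
		if (c != NULL) {
			only = c;
			remaining++;
		}
	}
	if (remaining != 1) {
		/* node stays in the trie: keep the ordered store. */
		sketch_node_store(&node->child[idx], NULL, SKETCH_LOCKED);
		return (NULL);
	}
	/* First store: node is being replaced, so skip the ordering. */
	sketch_node_store(&node->child[idx], NULL, SKETCH_UNSERIALIZED);
	/* Second store: the one that changes what readers reach. */
	sketch_node_store(parent_slot, only, SKETCH_LOCKED);
	return (node);
}
```

A caller would pass the slot in the parent (or root) that currently points at `node`; when the function returns non-NULL, that slot points at the surviving child and the returned node can be reclaimed by whatever scheme the caller uses.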

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

dougm requested review of this revision. Feb 17 2025, 5:59 PM
dougm created this revision.
markj added inline comments.
sys/kern/subr_pctrie.c
856

Why use child in the three stores above instead of a literal PCTRIE_NULL?

This revision is now accepted and ready to land. Feb 18 2025, 2:12 PM

Results of performing 11 buildworld tests on modified and unmodified sources, using a patch from @alc to instrument cycle counts for several page-related functions. Performance improvements are evident in the opr (vm_object_page_remove) and kmem_unback cases. Some cases that don't do much freeing may have gotten a little worse, for some reason.

                original                            with changes
                #calls      cycles/call             #calls      cycles/call
alloc_gpgs:     752499      3434.372098833354       752724      3418.422265000186 
alloc_gpgs:     688754      3652.5813439921944      688525      3661.66810210232  
alloc_gpgs:     688309      3975.5709775696673      688468      3707.9487325482087
alloc_gpgs:     688082      3766.6048610485377      687784      3901.0714352180335
alloc_gpgs:     688213      3780.41536268568        688081      4310.255092060383 
alloc_gpgs:     687883      3854.9886492325004      687778      3895.6761702177155
alloc_gpgs:     687877      3877.161751301468       687822      4045.502458484899 
alloc_gpgs:     688000      4263.730816860465       687897      3878.1561265712744
alloc_gpgs:     688003      3933.9803649111996      687820      3878.117705213573 
alloc_gpgs:     687803      3923.037749181088       687918      3932.454445151893 
alloc_gpgs:     688833      3929.0253660901844      687819      3943.970980737665 
alloc_grab:     1500        439.9313333333333       1505        448.67109634551497
alloc_grab:     324         748.7932098765432       342         922.2953216374269 
alloc_grab:     386         725.6217616580311       343         839.5655976676385 
alloc_grab:     284         763.1901408450705       292         926.013698630137  
alloc_grab:     283         722.4805653710247       276         868.1594202898551 
alloc_grab:     282         723.0744680851063       252         832.2063492063492 
alloc_grab:     252         775.531746031746        292         886.8595890410959 
alloc_grab:     302         729.2185430463576       295         878.9694915254237 
alloc_grab:     225         737.8622222222223       201         938.8059701492538 
alloc_grab:     200         735.005                 178         810.0505617977528 
alloc_grab:     202         673.509900990099        214         750.0280373831775 
alloc:          374707025   1645.4839279834691      374717526   1659.725050850704 
alloc:          374694582   1673.5760878869607      374693567   1671.4806031003995
alloc:          374696972   1668.960936022723       374696078   1669.87785437135  
alloc:          374701439   1657.7697043165078      374716487   1667.2020170118642
alloc:          374673512   1660.3353595036097      374682274   1675.2052507479978
alloc:          374711665   1650.3608927546998      374689507   1688.5364120351521
alloc:          374695683   1650.4322910066728      374696192   1676.5840374379893
alloc:          374692268   1648.5880521292208      374691317   1672.9629707218435
alloc:          374685421   1666.537060936246       374640447   1684.1120833544169
alloc:          374704084   1656.4897200906944      374684512   1677.768827378165 
alloc:          374691565   1646.5597222771748      374699616   1675.9975831840725
opr:            1825877     42207.950979720976      1828949     40468.94419855338 
opr:            2048002     37694.7010305654        2047483     36209.549937166754
opr:            2048947     38366.00136069894       2046351     35977.37083716332 
opr:            2046274     38312.004955836805      2047446     36102.1195421027  
opr:            2049431     37979.117602397935      2046667     36657.77648684422 
opr:            2044824     37866.99186531457       2049548     35951.91515934245 
opr:            2051731     38398.331697966256      2045735     36121.996644726714
opr:            2042941     38785.87913356284       2053400     36768.10080305834 
opr:            2050752     38354.771603782414      2043281     36414.52181369082 
opr:            2048303     38185.69551477491       2045747     36246.42377820914 
opr:            2044478     38262.74138924459       2052872     36251.66563672747 
collapse_scan:  63248       14125.669033012902      63315       18055.23855326542 
collapse_scan:  63370       13855.8851822629        63325       14103.989830240822
collapse_scan:  63422       14027.450127715934      63308       14081.0043438428  
collapse_scan:  63444       13980.526747998234      63278       14189.625715098455
collapse_scan:  63428       14873.081241722899      63429       14106.197937851772
collapse_scan:  63376       13980.024046957838      63324       14110.894573937212
collapse_scan:  63287       13968.580119139791      63486       14166.229735689758
collapse_scan:  63320       13956.721383449147      63375       14201.002840236686
collapse_scan:  63329       13998.82721975714       63327       14286.300219495633
collapse_scan:  63267       14124.018824979847      63374       14286.674771988513
collapse_scan:  63342       13993.0072305895        63310       15099.437008371506
osplit:         20480       3368.103857421875       20480       3388.83505859375  
osplit:         20484       3405.1379613356767      20483       3408.3190450617585
osplit:         20482       3367.8761351430526      20480       3444.904150390625 
osplit:         20486       3401.9500634579713      20483       3451.5582678318606
osplit:         20484       3416.00820152314        20481       3416.3748840388653
osplit:         20487       3421.4768877824963      20481       3423.384307406865 
osplit:         20483       3402.981203925206       20484       3408.136399140793 
osplit:         20480       3430.378759765625       20485       3477.10412496949  
osplit:         20480       3419.13603515625        20483       3479.576966264707 
osplit:         20484       3471.5063952353057      20484       3458.8027728959187
osplit:         20485       3426.6515499145717      20483       3534.532343894937 
kmem_unback:    224         4489.553571428572       216         4101.518518518518 
kmem_unback:    264         3644.6401515151515      262         4273.782442748092 
kmem_unback:    270         4500.2962962962965      264         4178.337121212121 
kmem_unback:    268         4442.899253731343       264         4116.810606060606 
kmem_unback:    270         4382.444444444444       266         4112.146616541353 
kmem_unback:    266         4593.424812030075       268         4191.9067164179105
kmem_unback:    270         4517.7                  268         4415.563432835821 
kmem_unback:    266         4524.988721804511       266         4282.68045112782  
kmem_unback:    270         4469.188888888889       266         4322.7406015037595
kmem_unback:    266         4467.402255639097       258         3410.1666666666665
kmem_unback:    270         4581.559259259259       264         4250.234848484848 
kmem_back:      906         3702.6545253863133      964         3694.1659751037346
kmem_back:      271         4047.4723247232473      269         3214.5985130111526
kmem_back:      275         4814.036363636364       261         3805.0459770114944
kmem_back:      250         3971.58                 269         3593.263940520446 
kmem_back:      295         4289.742372881356       276         4160.355072463768 
kmem_back:      283         3947.3639575971733      263         4263.019011406844 
kmem_back:      267         3514.5842696629215      269         4185.126394052045 
kmem_back:      267         4723.943820224719       243         4343.3127572016465
kmem_back:      254         3352.8700787401576      288         3789.4166666666665
kmem_back:      255         4220.03137254902        262         4523.885496183206 
kmem_back:      287         3131.076655052265       265         4442.652830188679
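
One way to summarize a row group from the table above is to average the cycles/call column for each build and compare the two means. The sketch below does that for the opr rows only, with the values copied from the table and rounded to one decimal place. The helper names are hypothetical, and the unweighted mean ignores the per-run call counts, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* opr cycles/call, original sources (rounded from the table). */
static const double opr_original[] = {
	42208.0, 37694.7, 38366.0, 38312.0, 37979.1, 37867.0,
	38398.3, 38785.9, 38354.8, 38185.7, 38262.7
};

/* opr cycles/call, with changes (rounded from the table). */
static const double opr_modified[] = {
	40468.9, 36209.5, 35977.4, 36102.1, 36657.8, 35951.9,
	36122.0, 36768.1, 36414.5, 36246.4, 36251.7
};

#define NRUNS (sizeof(opr_original) / sizeof(opr_original[0]))

/* Unweighted arithmetic mean of n samples. */
static double
sketch_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return (sum / (double)n);
}

/* Relative change from before to after; negative means fewer cycles. */
static double
sketch_relative_change(double before, double after)
{
	return ((after - before) / before);
}
```

Calling `sketch_relative_change(sketch_mean(opr_original, NRUNS), sketch_mean(opr_modified, NRUNS))` gives a single signed fraction for the opr case; the same two helpers can be applied to any other row group by swapping in its values.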
sys/kern/subr_pctrie.c
856

I used child to reduce the size of the diff between this change and the next iteration of D46895, where child will be a pointer parameter rather than a local variable. But I don't have to do that. I'll use PCTRIE_NULL here.

Were these buildworld measurements taken on amd64? The difference between the locked and unserialized variants of pctrie_node_store() is pretty minimal there, since x86 memory ordering constraints mean that plain stores already provide the desired semantics, ignoring compiler reordering of memory operations. So it's not too surprising that the results are a bit inconclusive. On arm64 I'd hope to see a more consistent improvement.

Yes.

This revision was automatically updated to reflect the committed changes.