Regular vnodes currently move between the active and free lists very frequently, e.g. when they are being stat'ed. This causes significant contention in workloads which access many files in a short timespan. A long-term fix would probably involve removing the active list in its current form altogether. In the meantime I propose trylocking the list lock and, on failure, setting a special flag (VI_DEFERRED) instead of releasing the hold count, delegating the return of the vnode to the syncer. This both eliminates the contention in my tests and maintains quasi-LRU ordering, especially when contention is low (or non-existent).
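To make the proposal concrete, here is a minimal sketch of the contended vdrop path, not the actual diff: VI_DEFERRED (and its value) and the vdrop_requeue() helper are illustrative, while mnt_listmtx, v_iflag and VI_OWEINACT are stock kernel names.

```c
#include <sys/param.h>
#include <sys/mutex.h>
#include <sys/mount.h>
#include <sys/vnode.h>

#define	VI_DEFERRED	0x0800	/* proposed flag; value illustrative */

/* Existing move-to-free-list logic, elided here. */
static void vdrop_requeue(struct vnode *vp, struct mount *mp);

/*
 * Called in place of the current code when the hold count is about
 * to go from 1 to 0 and the vnode has to leave the active list.
 */
static void
vdrop_deactivate(struct vnode *vp)
{
	struct mount *mp;

	ASSERT_VI_LOCKED(vp, __func__);
	mp = vp->v_mount;
	if (__predict_true(mtx_trylock(&mp->mnt_listmtx) != 0)) {
		/*
		 * No contention: drop the hold count and requeue the
		 * vnode to the free list, exactly as done today.
		 */
		vdrop_requeue(vp, mp);
		return;
	}
	if ((vp->v_iflag & VI_OWEINACT) != 0) {
		/*
		 * Inactive processing was requested; don't defer
		 * (see the notes below) and take the lock for real.
		 */
		mtx_lock(&mp->mnt_listmtx);
		vdrop_requeue(vp, mp);
		return;
	}
	/*
	 * Contended: keep the hold count (it now stands in for the
	 * reference the syncer will release) and mark the vnode so
	 * the syncer knows to finish the vdrop on our behalf.
	 */
	vp->v_iflag |= VI_DEFERRED;
	VI_UNLOCK(vp);
}
```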
Sample results from an incremental Linux kernel build on tmpfs (on a kernel with some other patches), make -s -j 104 bzImage:
before: 118.52s user 3772.92s system 7404% cpu 52.558 total
after: 151.06s user 1080.23s system 5031% cpu 24.473 total
Notes:
- the code opts not to defer if inactive processing was requested, because there is no clear indicator of whether a vnode is unlinked. The VV_NOSYNC flag is used only by ufs; it is trivial to add to zfs, but I don't know about tmpfs.
- if memory serves, ufs may have a known LoR (lock order reversal) when used with SU (soft updates), which makes me wonder whether the behavior should be opt-in (should this start running into deadlocks).
- I'm not happy with the need to scan the active list. struct mount could grow a new flag field denoting that at least one deferred vnode is present, and the syncer could refrain from scanning otherwise. Another note: the namecache holds many directory vnodes; perhaps a marker could be inserted to separate them from the rest, and the scan for deferred items could then start there. A sketch of such a sweep follows.
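For illustration, a minimal sketch of the syncer-side sweep with the per-mount hint, under stated assumptions: mnt_vdeferred is a hypothetical counter the deferring side would bump when setting VI_DEFERRED, and a production version would use a marker vnode rather than restarting the scan after each drop; the remaining names are stock.

```c
static void
vfs_sweep_deferred(struct mount *mp)
{
	struct vnode *vp;

	/* Racy read; a missed update is caught by the next sweep. */
	if (mp->mnt_vdeferred == 0)
		return;
restart:
	mtx_lock(&mp->mnt_listmtx);
	TAILQ_FOREACH(vp, &mp->mnt_activevnodelist, v_actfreelist) {
		if ((vp->v_iflag & VI_DEFERRED) == 0)
			continue;
		if (VI_TRYLOCK(vp) == 0)
			continue;	/* will be seen on the next sweep */
		if ((vp->v_iflag & VI_DEFERRED) == 0) {
			/* Lost a race; flag recheck under the interlock. */
			VI_UNLOCK(vp);
			continue;
		}
		vp->v_iflag &= ~VI_DEFERRED;
		atomic_subtract_int(&mp->mnt_vdeferred, 1);
		mtx_unlock(&mp->mnt_listmtx);
		/*
		 * Release the hold count kept by the deferring thread;
		 * this moves the vnode to the free list and drops the
		 * interlock.
		 */
		vdropl(vp);
		goto restart;
	}
	mtx_unlock(&mp->mnt_listmtx);
}
```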