I have started benchmarking, testing kernel builds on tmpfs and
single-threaded sendfile performance. So far, on a 16-core Haswell system,
the results show either no difference or a small improvement.
Drew has tested the patch on a couple of systems at Netflix. In their
workload, the VM page locks are the most heavily contended locks in
the kernel; with this patch they effectively disappear from lockstat
profiles, and we see a 1-2% decrease in CPU usage.