Without the patch:
```
# for i in $(seq 1 10); do rm -rf /usr/obj/*; time make -s -j96 buildkernel >/dev/null; done
285.97 real 19522.85 user 1936.66 sys
288.78 real 19520.76 user 1978.89 sys
287.94 real 19501.91 user 1971.79 sys
289.48 real 19706.73 user 1932.10 sys
290.79 real 19713.99 user 1948.50 sys
288.80 real 19685.91 user 1933.88 sys
288.42 real 19542.20 user 1984.51 sys
286.64 real 19622.04 user 1981.77 sys
288.69 real 19677.83 user 1942.67 sys
288.19 real 19562.56 user 2058.54 sys
```
With the patch:
```
# for i in $(seq 1 10); do rm -rf /usr/obj/*; time make -s -j96 buildkernel >/dev/null; done
295.40 real 20009.13 user 2093.96 sys
296.74 real 20134.21 user 2093.74 sys
298.14 real 20222.30 user 2046.39 sys
297.63 real 20196.40 user 2070.36 sys
295.27 real 20312.78 user 2047.11 sys
293.93 real 20194.31 user 2056.06 sys
297.38 real 20451.89 user 2035.46 sys
297.60 real 20138.13 user 2061.30 sys
294.96 real 20157.21 user 2044.70 sys
297.25 real 20164.89 user 2076.60 sys
```
I also tested two multi-threaded applications used during the build,
ctfmerge and lld, on their own. Their wall-clock runtimes don't change
significantly, so I suspect that using a read lock in pmap_fault() is
unlikely to provide much benefit.
Without the patch:
```
# for i in $(seq 1 10); do time ld.lld --export-dynamic -T /usr/src/sys/conf/ldscript.arm64 -o kernel.full -X *.o; done
2.45 real 3.62 user 32.91 sys
2.39 real 3.86 user 32.02 sys
2.38 real 3.57 user 31.68 sys
2.43 real 3.91 user 32.20 sys
2.39 real 3.45 user 32.00 sys
2.34 real 3.54 user 33.38 sys
2.36 real 3.57 user 30.85 sys
2.42 real 3.53 user 34.30 sys
2.34 real 3.65 user 32.65 sys
2.41 real 3.79 user 31.73 sys
# for i in $(seq 1 10); do time ctfmerge -L VERSION -o kernel *.o; done
13.85 real 31.20 user 0.97 sys
13.84 real 31.31 user 0.83 sys
13.86 real 31.19 user 1.00 sys
13.79 real 30.94 user 0.84 sys
13.98 real 31.87 user 0.92 sys
14.08 real 31.86 user 1.04 sys
14.05 real 31.95 user 0.89 sys
13.73 real 30.88 user 0.83 sys
14.05 real 31.83 user 1.05 sys
14.05 real 31.77 user 0.94 sys
```
With the patch:
```
# for i in $(seq 1 10); do time ld.lld --export-dynamic -T /usr/src/sys/conf/ldscript.arm64 -o kernel.full -X *.o; done
2.48 real 3.51 user 36.54 sys
2.33 real 3.41 user 33.98 sys
2.30 real 3.36 user 35.73 sys
2.34 real 3.57 user 33.00 sys
2.33 real 3.55 user 35.41 sys
2.32 real 3.52 user 34.47 sys
2.31 real 3.51 user 34.28 sys
2.41 real 3.91 user 35.10 sys
2.32 real 3.85 user 34.24 sys
2.40 real 3.35 user 33.40 sys
# for i in $(seq 1 10); do time ctfmerge -L VERSION -o kernel *.o; done
13.83 real 31.57 user 0.72 sys
13.86 real 31.55 user 0.75 sys
13.96 real 31.86 user 0.75 sys
13.77 real 31.21 user 0.77 sys
13.65 real 30.83 user 0.69 sys
13.80 real 31.36 user 0.64 sys
13.65 real 30.76 user 0.78 sys
13.81 real 31.31 user 0.68 sys
13.68 real 30.88 user 0.65 sys
13.77 real 31.21 user 0.78 sys
```
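To make the comparison easier to read, here is a minimal sketch that averages the wall-clock ("real") column of the runs above and prints the relative change. The numbers are copied from the transcripts; the helper names (`summarize`, `report`) and the choice of mean and sample standard deviation are my own assumptions, not something the patch or the build tooling provides.

```python
from statistics import mean, stdev

# Wall-clock ("real") seconds copied from the transcripts above.
REAL_SECONDS = {
    "buildkernel": {
        "without": [285.97, 288.78, 287.94, 289.48, 290.79,
                    288.80, 288.42, 286.64, 288.69, 288.19],
        "with": [295.40, 296.74, 298.14, 297.63, 295.27,
                 293.93, 297.38, 297.60, 294.96, 297.25],
    },
    "ld.lld": {
        "without": [2.45, 2.39, 2.38, 2.43, 2.39,
                    2.34, 2.36, 2.42, 2.34, 2.41],
        "with": [2.48, 2.33, 2.30, 2.34, 2.33,
                 2.32, 2.31, 2.41, 2.32, 2.40],
    },
    "ctfmerge": {
        "without": [13.85, 13.84, 13.86, 13.79, 13.98,
                    14.08, 14.05, 13.73, 14.05, 14.05],
        "with": [13.83, 13.86, 13.96, 13.77, 13.65,
                 13.80, 13.65, 13.81, 13.68, 13.77],
    },
}


def summarize(without, with_patch):
    """Return (mean without, mean with, percent change) for two samples."""
    m0 = mean(without)
    m1 = mean(with_patch)
    return m0, m1, 100.0 * (m1 - m0) / m0


def report(data=REAL_SECONDS):
    """Build one summary line per workload."""
    lines = []
    for name, runs in data.items():
        m0, m1, pct = summarize(runs["without"], runs["with"])
        sd0 = stdev(runs["without"])
        sd1 = stdev(runs["with"])
        lines.append(
            f"{name:12s} without {m0:7.2f}s (sd {sd0:.2f})  "
            f"with {m1:7.2f}s (sd {sd1:.2f})  change {pct:+.1f}%"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

By this measure buildkernel is roughly 2.8% slower in wall-clock time with the patch, while ld.lld and ctfmerge each move by only about 1 to 1.5% in the other direction. Ten runs per configuration is a small sample, so treat the percentages as indicative rather than as a significance test.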