A test program which malloc()s a 64-byte buffer and frees it, in a loop,
completes about 25% faster than before on an EC2 Graviton instance. On
the same system I see no significant difference in buildkernel times.
On an amd64 NUC, a `sort -n` of some test data was roughly 10% faster with
this change. On the Graviton it was roughly 5% faster.