With this change, a "make -j8 buildworld" on an Amazon EC2 a1.2xlarge instance (8 cores, 16 GB RAM) takes 9% less time.
Since the a1.2xlarge cores are Cortex-A72s with PIPT L1 I-caches, I also tested this patch with the implementation of arm64_icache_sync_range changed from
```
ENTRY(arm64_icache_sync_range)
	/*
	 * XXX Temporary solution - I-cache flush should be range based for
	 * PIPT cache or IALLUIS for VIVT or VIPT caches
	 */
	/* cache_handle_range	dcop = cvau, ic = 1, icop = ivau */
	cache_handle_range	dcop = cvau
	ic	ialluis
	dsb	ish
	isb
	ret
END(arm64_icache_sync_range)
```
to
```
ENTRY(arm64_icache_sync_range)
	cache_handle_range	dcop = cvau, ic = 1, icop = ivau
	ret
END(arm64_icache_sync_range)
```