Rewrite the first search loop in vm_phys_unfree_page.
The new version needs fewer iterations because each iteration produces a different m_set value, whereas the current loop can compute the same m_set in consecutive iterations. Each iteration also has two exit tests rather than three.
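As a rough illustration of why the iteration count drops, here is a minimal userland sketch, not the kernel code itself. It assumes the old loop aligns the address down once per order and the new loop clears the lowest set bit each time, which is what the disassembly suggests. The helper names, PAGE_SHIFT, MAX_ORDER and the floor parameter are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumed 4 KiB pages (matches the shrq $0xc) */
#define MAX_ORDER  12	/* assumed largest order, for illustration only */

/*
 * Per-order search: one candidate block start per order, found by
 * aligning pa down to a 2^(order + 1)-page boundary.  Consecutive
 * orders can produce the same address.  Returns the candidate count.
 */
static int
candidates_by_order(uint64_t pa, uint64_t out[], int max)
{
	int n = 0;

	assert(out != NULL);
	for (int order = 0; order < MAX_ORDER && n < max; order++)
		out[n++] = pa & (~(uint64_t)0 << (PAGE_SHIFT + 1 + order));
	return (n);
}

/*
 * Bit-clearing search: pa &= pa - 1 clears the lowest set bit, so
 * every candidate is strictly smaller than the one before it.  Stops
 * once the candidate drops below floor (hypothetical lower bound).
 */
static int
candidates_by_clearing(uint64_t pa, uint64_t floor, uint64_t out[], int max)
{
	int n = 0;

	assert(out != NULL);
	while (n < max) {
		pa &= pa - 1;
		if (pa < floor)
			break;
		out[n++] = pa;
		if (pa == 0)
			break;
	}
	return (n);
}
```

For pa = 0x12345000 and a floor of 0x12000000, the per-order helper yields 12 candidates of which only 5 are distinct, while the bit-clearing helper yields exactly those 5.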
The search loop shrinks from this (58 bytes):
    20a0: 49 c7 c1 00 e0 ff ff    movq    $-0x2000, %r9           # imm = 0xE000
    20a7: 49 d3 e1                shlq    %cl, %r9
    20aa: 49 21 c1                andq    %rax, %r9
    20ad: 4d 29 d9                subq    %r11, %r9
    20b0: 72 6c                   jb      0x211e <vm_phys_unfree_page+0xce>
    20b2: 4c 8d 51 01             leaq    0x1(%rcx), %r10
    20b6: 48 8b 33                movq    (%rbx), %rsi
    20b9: 49 c1 e9 0c             shrq    $0xc, %r9
    20bd: 4d 6b f9 68             imulq   $0x68, %r9, %r15
    20c1: 46 0f b6 4c 3e 5c       movzbl  0x5c(%rsi,%r15), %r9d
    20c7: 41 80 f9 0d             cmpb    $0xd, %r9b
    20cb: 41 0f 94 c6             sete    %r14b
    20cf: 75 09                   jne     0x20da <vm_phys_unfree_page+0x8a>
    20d1: 48 83 f9 0b             cmpq    $0xb, %rcx
    20d5: 4c 89 d1                movq    %r10, %rcx
    20d8: 72 c6                   jb      0x20a0 <vm_phys_unfree_page+0x50>
to this (36 bytes):
    20a0: 49 89 c0                movq    %rax, %r8
    20a3: 48 ff c8                decq    %rax
    20a6: 4c 21 c0                andq    %r8, %rax
    20a9: 49 89 c2                movq    %rax, %r10
    20ac: 49 29 d2                subq    %rdx, %r10
    20af: 49 c1 ea 0c             shrq    $0xc, %r10
    20b3: 4c 39 c8                cmpq    %r9, %rax
    20b6: 76 0c                   jbe     0x20c4 <vm_phys_unfree_page+0x74>
    20b8: 4d 6b da 68             imulq   $0x68, %r10, %r11
    20bc: 42 80 7c 19 5c 0d       cmpb    $0xd, 0x5c(%rcx,%r11)
    20c2: 74 dc                   je      0x20a0 <vm_phys_unfree_page+0x50>
Overall, the function shrinks by 16 bytes.