Most architectures we support (except for riscv64) have instructions
to compute these functions very quickly. Replace old code with the
ctz and clz builtin functions, allowing clang to generate good code
for all architectures.
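
As a rough illustration of the replacement described above, here is a
minimal sketch of how ffs()- and fls()-style functions can be written on
top of the compiler builtins. The names sketch_ffs() and sketch_fls() are
hypothetical, chosen to avoid clashing with the libc symbols; this is not
the code from the commit itself.

```c
#include <limits.h>

/*
 * Hypothetical helper: 1-based index of the least significant set bit,
 * or 0 if no bit is set. __builtin_ctz(0) is undefined, so zero is
 * handled explicitly.
 */
static inline int
sketch_ffs(int mask)
{
	return (mask == 0 ? 0 : __builtin_ctz((unsigned int)mask) + 1);
}

/*
 * Hypothetical helper: 1-based index of the most significant set bit,
 * or 0 if no bit is set. __builtin_clz(0) is likewise undefined.
 */
static inline int
sketch_fls(int mask)
{
	return (mask == 0 ? 0 :
	    (int)(CHAR_BIT * sizeof(mask)) -
	    __builtin_clz((unsigned int)mask));
}
```

Both builtins are available in clang and GCC. The explicit zero check
is needed because the builtins leave the result undefined for a zero
argument, while ffs() is specified to return 0 in that case.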
As a consequence, toss out the arm and i386 assembly implementations
of ffs(): clang generates comparable code (for i386) or better code
(for arm) from the builtins.
Unit tests are proposed in D40729 to test these new implementations.
Sponsored by: FreeBSD Foundation