Expand the bitcount* API to support 64-bit integers, plain ints and longs
and create a "hidden" API that can be used in other system headers without
adding namespace pollution.
- If the POPCNT instruction is enabled at compile time, use __builtin_popcount*() to implement __bitcount*(), otherwise fall back to software implementations.
- Use the existing bitcount16() and bitcount32() from <sys/systm.h> to implement the non-POPCNT __bitcount16() and __bitcount32() in <sys/types.h>.
- For the non-POPCNT __bitcount64(), use a similar SWAR method on 64-bit systems. For 32-bit systems, use two __bitcount32() operations on the two halves.
- Use __bitcount32() to provide a __bitcount() that operates on plain ints.
- Use either bitcount32() or bitcount64() to provide a __bitcountl() that operates on longs.
- Add public bitcount*() wrappers for __bitcount*() for use in the kernel in <sys/libkern.h>.
- Use __bitcountl() instead of __builtin_popcountl() in BIT_COUNT().
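
For illustration only, the fallback strategy described above can be sketched roughly as follows. This is not the committed code: the sketch_* names are hypothetical, the SWAR constants are the commonly used ones, and the ULONG_MAX width test is an assumption standing in for whatever the headers actually use. __POPCNT__ is the macro GCC and Clang define when POPCNT is enabled (for example with -mpopcnt).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* 32-bit count: builtin when POPCNT is enabled, SWAR otherwise. */
static inline uint32_t
sketch_bitcount32(uint32_t x)
{
#ifdef __POPCNT__
	return ((uint32_t)__builtin_popcount(x));
#else
	x = x - ((x >> 1) & 0x55555555u);			/* 2-bit sums */
	x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);	/* 4-bit sums */
	x = (x + (x >> 4)) & 0x0f0f0f0fu;			/* 8-bit sums */
	return ((x * 0x01010101u) >> 24);			/* add the bytes */
#endif
}

/* Software 64-bit count using the same SWAR steps on a 64-bit word. */
static inline uint64_t
sketch_bitcount64_swar(uint64_t x)
{
	x = x - ((x >> 1) & 0x5555555555555555ull);
	x = (x & 0x3333333333333333ull) + ((x >> 2) & 0x3333333333333333ull);
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0full;
	return ((x * 0x0101010101010101ull) >> 56);
}

/* 32-bit-friendly variant: count each half with the 32-bit routine. */
static inline uint64_t
sketch_bitcount64_halves(uint64_t x)
{
	return ((uint64_t)sketch_bitcount32((uint32_t)(x >> 32)) +
	    sketch_bitcount32((uint32_t)x));
}

/* Count for longs: pick the 32- or 64-bit routine by the width of long. */
static inline unsigned long
sketch_bitcountl(unsigned long x)
{
#if ULONG_MAX > 0xffffffffUL
	return ((unsigned long)sketch_bitcount64_swar(x));
#else
	return (sketch_bitcount32((uint32_t)x));
#endif
}

/* The two 64-bit variants should always agree. */
static inline void
sketch_selfcheck(uint64_t x)
{
	assert(sketch_bitcount64_swar(x) == sketch_bitcount64_halves(x));
}
```

Splitting a 64-bit value into halves avoids 64-bit multiplies and shifts on 32-bit machines, which is presumably why the two paths differ.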
Discussed with: bde