The kernel's qsort() routine can, in the worst case, perform O(N*N) comparisons before the result is sorted.
Because the sorting key is very small, 64 bits, we can use a bit-slice sorter algorithm instead, which is faster than mergesort() and comparable to qsort().
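
For illustration only, a bit-slice sorter over 64-bit keys can be sketched as an in-place binary radix sort that partitions on one bit at a time, starting from the most significant bit. This is a minimal sketch under that assumption, not necessarily the code in this change; the names bitslice_sort() and bitslice_sort_bit() are hypothetical. Each key is examined at most once per bit, so the work is bounded by 64 * N bit tests regardless of the input order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: sort keys[0..n) by looking only at bits
 * "bit" down to 0. All keys passed in are assumed to agree on
 * every bit above "bit".
 */
static void
bitslice_sort_bit(uint64_t *keys, size_t n, int bit)
{
	uint64_t mask, tmp;
	size_t lo, hi;

	while (n > 1 && bit >= 0) {
		mask = (uint64_t)1 << bit;
		lo = 0;
		hi = n;

		/* Move keys with the bit clear to the front, set to the back. */
		while (lo < hi) {
			if ((keys[lo] & mask) == 0) {
				lo++;
			} else {
				hi--;
				tmp = keys[lo];
				keys[lo] = keys[hi];
				keys[hi] = tmp;
			}
		}

		bit--;
		/* Recurse on the "bit set" half; depth is at most 64. */
		bitslice_sort_bit(keys + lo, n - lo, bit);
		/* Continue with the "bit clear" half in this loop. */
		n = lo;
	}
}

/* Hypothetical entry point: sort n 64-bit keys in ascending order. */
static void
bitslice_sort(uint64_t *keys, size_t n)
{
	bitslice_sort_bit(keys, n, 63);
}
```

A caller would pass an array of uint64_t keys and its element count, for example bitslice_sort(keys, nkeys). No comparison callback is needed, because the ordering comes directly from the bits of the key.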