Use insertion sort instead of bubble sort in TCP LRO.
Replacing the bubble sort with insertion sort gives an 80% reduction
in runtime on average, with randomized keys, for small partitions.
If the keys are pre-sorted, insertion sort runs in linear time, and
even if the keys are reversed, insertion sort is faster than bubble
sort, although not by much.
Update comment describing "tcp_lro_sort()" while at it.
Differential Revision: https://reviews.freebsd.org/D6619
Sponsored by: Mellanox Technologies
Tested by: Netflix
Suggested by: Pieter de Goeje <pieter@degoeje.nl>
Reviewed by: ed, gallatin, gnn, transport