Page MenuHomeFreeBSD

Make quicksort or mergesort configurable for the LRO code.
AbandonedPublic

Authored by hselasky on Jan 19 2016, 7:25 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Dec 9, 12:30 AM
Unknown Object (File)
Oct 24 2024, 3:40 PM
Unknown Object (File)
Oct 22 2024, 3:17 AM
Unknown Object (File)
Oct 18 2024, 12:24 AM
Unknown Object (File)
Sep 21 2024, 9:26 AM
Unknown Object (File)
Sep 10 2024, 12:35 AM
Unknown Object (File)
Sep 8 2024, 10:05 PM
Unknown Object (File)
Sep 8 2024, 7:05 PM
Subscribers

Details

Reviewers
gallatin
rstone
Group Reviewers
transport
Summary

In the worst case, qsort() can take O(n**2) time and consume O(n) stack
space. Use mergesort instead.

Suggested by: Ryan Stone <rstone@freebsd.org>

Further fix a bug in the compare routine when comparing 32-bit integers we need to use a 64-bit variable.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

hselasky retitled this revision from to Use mergesort instead of quicksort in LRO code.
hselasky updated this object.
hselasky edited the test plan for this revision. (Show Details)
hselasky added a reviewer: gallatin.
hselasky set the repository for this revision to rS FreeBSD src repository - subversion.

Can we see some numbers that support switching from the well known, and validated, qsort() to this private version of mergesort? If this is a win then we probably want an abstracted version of this to sit along side our qsort.

@gnn : Yes, Andrew is going to test it, probably next week.

Fix compare routine. Need to use 64-bit temporary variables when comparing 32-bit numbers.
Optimize away the sequence number, and make the mergesort preserve the packet order instead.

@gnn : Yes, Andrew is going to test it, probably next week.

After running the latest version of the patch (with the fixed compare) for several days on internet facing hosts with ~90K connections, I cannot find much difference between the qsort and mergesort versions. Both aggregate at the same rate, and use roughly the same amount of CPU. Honestly, for our workloads, I'm quite happy with either.

I think choosing mergesort over qsort comes down to rstone's comments on the commit of the qsort version ("In the worst case, qsort() can take O(n**2) time and consume O(n) stack space. Is there a DOS concern here?")

If the performance is the same then I think using the tried and true algorithm wins out over putting a new one in specifically just for this case.

Hi,

@gnn

The reason we don't use a factored out mergesort() is that mergesort() needs a temporary working array, which it allocates using malloc(). Then it becomes very slow.

Given that this code is running in the fast path it makes sense to unroll/optimize away constants.

Did you have a look at the existing qsort() code v.s. this one?

BTW: Would it help if I factored out the sorting function to be a generic mergesort for pointers in libkern.h ?

For example:

void
mergesort_by_pointer(void ptr, void temp, unsigned max, compare_func_t);

--HPS

hselasky retitled this revision from Use mergesort instead of quicksort in LRO code to Make quicksort or mergesort configurable for the LRO code..Feb 5 2016, 8:55 AM
hselasky edited edge metadata.
hselasky edited edge metadata.

Add more missed 32-bit to 64-bit casts. Needed for correct sorting operation.

Have we tested the case where there is only a single flow, so all of the flowids are equal? That tends to be a very bad case for qsort.

@rstone

The entries are never equal due to the sequence number stamped on each mbuf. Is this also the case when the array appears already sorted?

Hi,

Our FreeBSD qsort() routine has been specifically modified to not exhibit the so-called QuickSort worst case behaviour of O(N**2) sorting time. This is not documented in our source code, but here:

http://cs.fit.edu/~pkc/classes/writing/samples/bentley93engineering.pdf

So I think DOS w.r.t O(N**2) is not a valid consern.

Thank you for your input Ryan.

BTW:

Drew Gallatin has tested our qsort() v.s. my mergesort() and found that:

"It looks like mergesort is nearly 2x as expensive. (4.7% vs 2.5%)"