Page MenuHomeFreeBSD

Mergesort Benchmark
Needs ReviewPublic

Authored by milesfertel_college.harvard.edu on Jul 18 2017, 10:23 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, May 3, 2:04 PM
Unknown Object (File)
Feb 29 2024, 2:50 PM
Unknown Object (File)
Feb 29 2024, 11:19 AM
Unknown Object (File)
Feb 18 2024, 12:39 PM
Unknown Object (File)
Dec 25 2023, 9:24 PM
Unknown Object (File)
Dec 22 2023, 10:18 PM
Unknown Object (File)
Nov 18 2023, 8:09 PM
Unknown Object (File)
Nov 5 2023, 3:59 AM
Subscribers
None

Details

Reviewers
brooks
Summary

Initial mergesort benchmarking script. Does not yet incorporate ministat or a driver script.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

This benchmark is my initial design without full understanding of statistical validity. Is it at all still usable or it better to just pull out the relevant bits that run the sorting code into separate files and then attempt to bench them with ministat and multiple trials. Is shell a better medium for this?

tools/tools/msortbench/mergesort_bench.c
6

I'm fairly sure i'll need to generate a makefile such that building the benchmarking script finds wiki.c correctly. Could you point me in the right direction?

This benchmark is my initial design without full understanding of statistical validity. Is it at all still usable or it better to just pull out the relevant bits that run the sorting code into separate files and then attempt to bench them with ministat and multiple trials. Is shell a better medium for this?

Just taking an average isn't really going to cut it for validity. It would be better to have main() do some argument processing and be able to perform a specified number of runs of a single time of benchmark with a selected sort algorithm (at least mergesort and wikisort, but ideally also heap sort and quick sort) and array size. An interface like sortbench quick 5 10 where 5 is number of runs and 2^10 is array size seems sensible. You could then have a shell script to make runs with different parameters.

tools/tools/msortbench/mergesort_bench.c
6

I think this fragment would yield the right result:

CFLAGS+=-I${SRCTOP}/lib/libc
13

int here should be SORT_TYPE if it is to have any meaning.

83

This should probably be arc4random() or more efficiently arc4random_buf().

Updated to reflect argument based design.

tools/tools/msortbench/mergesort_bench.c
13

The reason I didn't use SORT_TYPE is because it would imply that the sorthelp function could then be used for any sort type. In fact it would only function with basic types and would provide different behavior for derived types. I think it would be better for a different comparison function to be provided for different types, for consistency.

This is looking good so far. You may want a macro to disable wikisort support so we can commit the benchmark on it's own.

tools/tools/msortbench/mergesort_bench.c
13

You'll need a different function for each type (for primitive types you might use a macro to generate them), but there rests of the code isn't modular with respect to types so SORT_TYPE doesn't actually mean anything.

45

These functions should probably take a function pointer rather than an enum.

150

I'd add a usage() function that prints an argument and exits in place of the return 1;

milesfertel_college.harvard.edu marked 2 inline comments as done.

Updated bench to handle usage better and only use wiki if provided

tools/tools/msortbench/mergesort_bench.c
79

Update to use arc4random() instead.

tools/tools/msortbench/mergesort_bench.c
182

Is this proper protocol?

I have nearly completed a version of this script that uses function pointers rather than enums, but I worry about it. I feel like it makes the code unnecessarily messy, and when I tried to clean it up with typedefs and macros, it makes it much less readable. What do you recommend?

If you can create a Makefile for this I'll look to get it committed soon (I'll be heading to the UK next Sunday, but will be doing FreeBSD things during the week so helpful I'll be able to get this committed.)

tools/tools/msortbench/mergesort_bench.c
182

The common pattern is for usage to call exit() unless you're using it to implement a help option of some sort.

Update the exits and change to use arc4rand, as well as add Makefile.

milesfertel_college.harvard.edu added inline comments.
tools/tools/msortbench/Makefile
7

I wrote this Makefile by reading other /tools makefiles. I've not written any Makefiles built in this way. Is this sufficient?

Added a python driver script that outputs statistical comparisons using ministat to files. Decided on python rather than shell because I felt limited by the built ins of bourne shell and the bash script I wrote was needlessly complex in order to perform the simple commands.

tools/tools/msortbench/Makefile
7

You'll want a

MAN=

line to prevent a manpage from being installed, but otherwise that's basically it.

tools/tools/msortbench/bench.py
1

Missing

#!/usr/bin/env python

I'd suggest adding a README file describing what the tests do and what the script's expected outputs are.