Details

Reviewers

brooks

Summary

Initial mergesort benchmarking script. Does not yet incorporate ministat or a driver script.

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

milesfertel_college.harvard.edu created this revision.Jul 18 2017, 10:23 PM

This benchmark is my initial design without full understanding of statistical validity. Is it at all still usable or it better to just pull out the relevant bits that run the sorting code into separate files and then attempt to bench them with ministat and multiple trials. Is shell a better medium for this?

tools/tools/msortbench/mergesort_bench.c
6	I'm fairly sure i'll need to generate a makefile such that building the benchmarking script finds wiki.c correctly. Could you point me in the right direction?

In D11646#241239, @milesfertel_college.harvard.edu wrote:

This benchmark is my initial design without full understanding of statistical validity. Is it at all still usable or it better to just pull out the relevant bits that run the sorting code into separate files and then attempt to bench them with ministat and multiple trials. Is shell a better medium for this?

Just taking an average isn't really going to cut it for validity. It would be better to have main() do some argument processing and be able to perform a specified number of runs of a single time of benchmark with a selected sort algorithm (at least mergesort and wikisort, but ideally also heap sort and quick sort) and array size. An interface like sortbench quick 5 10 where 5 is number of runs and 2^10 is array size seems sensible. You could then have a shell script to make runs with different parameters.

tools/tools/msortbench/mergesort_bench.c
6	I think this fragment would yield the right result: CFLAGS+=-I${SRCTOP}/lib/libc
13	`int` here should be `SORT_TYPE` if it is to have any meaning.
83	This should probably be arc4random() or more efficiently arc4random_buf().

Updated to reflect argument based design.

tools/tools/msortbench/mergesort_bench.c
13	The reason I didn't use SORT_TYPE is because it would imply that the sorthelp function could then be used for any sort type. In fact it would only function with basic types and would provide different behavior for derived types. I think it would be better for a different comparison function to be provided for different types, for consistency.

This is looking good so far. You may want a macro to disable wikisort support so we can commit the benchmark on it's own.

tools/tools/msortbench/mergesort_bench.c
13	You'll need a different function for each type (for primitive types you might use a macro to generate them), but there rests of the code isn't modular with respect to types so `SORT_TYPE` doesn't actually mean anything.
45	These functions should probably take a function pointer rather than an enum.
150	I'd add a `usage()` function that prints an argument and exits in place of the `return 1;`

Updated bench to handle usage better and only use wiki if provided

milesfertel_college.harvard.edu added inline comments.Jul 28 2017, 4:37 PM

tools/tools/msortbench/mergesort_bench.c
79	Update to use arc4random() instead.

milesfertel_college.harvard.edu added inline comments.Jul 28 2017, 4:38 PM

tools/tools/msortbench/mergesort_bench.c
182	Is this proper protocol?

I have nearly completed a version of this script that uses function pointers rather than enums, but I worry about it. I feel like it makes the code unnecessarily messy, and when I tried to clean it up with typedefs and macros, it makes it much less readable. What do you recommend?

If you can create a Makefile for this I'll look to get it committed soon (I'll be heading to the UK next Sunday, but will be doing FreeBSD things during the week so helpful I'll be able to get this committed.)

tools/tools/msortbench/mergesort_bench.c
182	The common pattern is for usage to call `exit()` unless you're using it to implement a help option of some sort.

Update the exits and change to use arc4rand, as well as add Makefile.

milesfertel_college.harvard.edu marked 3 inline comments as done.Jul 30 2017, 5:26 PM

milesfertel_college.harvard.edu added inline comments.

tools/tools/msortbench/Makefile
7	I wrote this Makefile by reading other /tools makefiles. I've not written any Makefiles built in this way. Is this sufficient?

Added a python driver script that outputs statistical comparisons using ministat to files. Decided on python rather than shell because I felt limited by the built ins of bourne shell and the bash script I wrote was needlessly complex in order to perform the simple commands.

brooks added inline comments.Aug 24 2017, 5:49 PM

tools/tools/msortbench/Makefile
7	You'll want a MAN= line to prevent a manpage from being installed, but otherwise that's basically it.

brooks added inline comments.Aug 24 2017, 6:09 PM

tools/tools/msortbench/bench.py
1	Missing #!/usr/bin/env python

I'd suggest adding a README file describing what the tests do and what the script's expected outputs are.

I've posted an updated diff in D12677

Mergesort Benchmark
Needs ReviewPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 31566

tools/tools/msortbench/Makefile

tools/tools/msortbench/bench.py

tools/tools/msortbench/mergesort_bench.c

Mergesort BenchmarkNeeds ReviewPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 31566

tools/tools/msortbench/Makefile

tools/tools/msortbench/bench.py

tools/tools/msortbench/mergesort_bench.c

Mergesort Benchmark
Needs ReviewPublic
Actions

Revision Contents
Changeset List