Taken from musl commit cef0f289
Details
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
beside strange style, I would love it to come
Please also have a look at memchr/strchr :)
I wondered if we should use a vendor branch for this, but suspect it's not worth doing so.
| lib/libc/string/memmem.c | ||
|---|---|---|
| 45–56 | Would it be worth applying this strategy to 7 and 8-byte needles as well? | |
| lib/libc/string/strstr.c | ||
| 65 | Using a shift table is an extension to the two-way algorithm. The original algorithm can be found on page 672 of this PDF: http://monge.univ-mlv.fr/~mac/Articles-PDF/CP-1991-jacm.pdf. Using a shift table is awesome, because it allows us to achieve sub-linear running time for certain inputs, which is good. The only question that I have is what the advantage of this approach is for strstr(). There is no way we can get a sublinear running time in the first place. | |
| lib/libc/string/strstr.c | ||
|---|---|---|
| 65 | It looks like Rich explains what he did here: http://stackoverflow.com/questions/3183582/what-is-the-fastest-substring-search-algorithm I see a cheeky comment from @cem there too :-) | |
| lib/libc/string/strstr.c | ||
|---|---|---|
| 65 | Ah, yes. A younger and even more foolish time :-). | |
What is the layout upstream? If we think we will adopt more things from musl (and given the fact that you've already had to pull in one commit from upstream), I think a vendor branch and sticking this in contrib/musl might be sensible.
Also, have you run any tests? For the SSE ones I've tried to write some simple tests (though not completely exhaustive). I haven't tried memem and strstr though, but perhaps they might serve as inspiration (one example below, but all of the little programs in this directory have a 'tests' mode):
https://github.com/bsdjhb/freebsd/blob/libc_sse/tools/tools/libcsse/memcmp/main.c#L134
Even though I am looking at a few other string routines in musl I don't think this is worth putting in a vendor branch. We don't expect these to change significantly upstream, and I don't think the complexity of going through a vendor branch and obfuscation of using reach-over Makefiles is worth it. If we had a plan to migrate wholesale to musl's string routines it would be a different matter.
Also, have you run any tests? For the SSE ones I've tried to write some simple tests (though not completely exhaustive). I haven't tried memem and strstr though, but perhaps they might serve as inspiration (one example below, but all of the little programs in this directory have a 'tests' mode):
https://github.com/bsdjhb/freebsd/blob/libc_sse/tools/tools/libcsse/memcmp/main.c#L134
I've run correctness tests including the very small set we have in contrib/netbsd-tests (which now include a test for the fixed bug).
I'll look at cribbing from your libc_sse branch for benchmarking.
The results for a pathological case have the musl strstr completing in about 0.05% of the time.
This was with big = 'a' * 65534 + 'b' and little = 'a' * 16382 + 'b'.
x libc_large
+ musl_large
+------------------------------------------------------------------------------+
|+ |
|+ |
|+ |
|+ |
|+ |
|+ |
|+ |
|+ x x |
|+ xxx |
|+ xxxxx|
|A |AM| |
+------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 1.9370947e+09 2.0536267e+09 1.9890161e+09 1.9854802e+09 38220526
+ 10 1008687 1073904 1018041 1025801.2 22588.952
Difference at 95.0% confidence
-1.98445e+09 +/- 2.53935e+07
-99.9483% +/- 1.27896%
(Student's t, pooled s = 2.7026e+07)With little = 'a' * 30 + 'b':
x libc_large
+ musl_large
+------------------------------------------------------------------------------+
|++ x |
|++ xx |
|++ + xx |
|++ + xx x x x|
||A| |_____M___A_________| |
+------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 5062292 7740604 5124418.5 5491536.3 848585.89
+ 10 1075063 1322832 1122805 1149504.3 90691.712
Difference at 95.0% confidence
-4.34203e+06 +/- 567007
-79.0677% +/- 10.3251%
(Student's t, pooled s = 603458)And with little = 'a' * 31:
x libc_large
+ musl_large
+------------------------------------------------------------------------------+
| x + |
| x + |
| x + |
| x + |
| x ++ |
| xxx ++ + x +|
|||______M___AM______A__|________________| |
+------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 285 3916 343.5 710.5 1127.514
+ 10 844 7236 858.5 1521.4 2009.3719
No difference proven at 95.0% confidencelittle = "aaa"
x libc_large
+ musl_large
+------------------------------------------------------------------------------+
| + x |
| + x |
| ++x |
| ++x |
| ++x+x x +|
||__|_________M_M____AA________________|_| |
+------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 98 1831 99.5 281.1 545.09988
+ 10 54 2030 60.5 264.2 621.00238
No difference proven at 95.0% confidencelittle = "aab"
x libc_large
+ musl_large
+------------------------------------------------------------------------------+
| + |
|+ ++ |
|+ ++ x x |
|+ ++ x xx xxxxx|
| |A| |___AM__| |
+------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 10 770661 883471 851452.5 839097.2 38416.116
+ 10 196720 223935 218687 212775.1 11295.819
Difference at 95.0% confidence
-626322 +/- 26603.9
-74.6424% +/- 3.17054%
(Student's t, pooled s = 28314.2)