Page MenuHomeFreeBSD

qsort_r: Standardize on thunk-last API
AbandonedPublic

Authored by cem on Jan 13 2017, 8:07 PM.

Details

Summary

Glibc qsort_r takes the 'thunk' parameter in a different argument, and
passes 'thunk' in a different argument to the comparator. This allows
qsort and qsort_r to share an implementation on common architectures,
where the bogus 3rd argument is mostly free and ignored. Also, support
Linux-originated programs and libraries in a straightforward way.
Backwards compatibility for old programs is provided using the GCC
extension __builtin_choose_expr, which Clang supports.

(It seems like C11 _Generic() should support this; however, it
explicitly does not support 'function' types, and it seems that
'function' and 'function pointer' types are incompatible for its
purposes, at least in Clang. Requiring qsort_r(a,b,c,&cmp,t) seems much
less useful to me. Additionally, a few other hacks were explored and
nothing panned out quite as well as using the GCC extension. See the
phabricator review.)

Test Plan

Inspired by basename/dirname compat routines.

$ cc -std=c11 -Wall -Wextra -g -O2   test.c
$ nm test.o | grep qsort
                 U __linux_qsort_r
                 U qsort_r
$ ./a.out
BSD qsort_r:   1 2 3 4 5
Linux qsort_r: 1 2 3 4 5
#include <sys/param.h>

#include <stdio.h>
#include <stdlib.h>

int
bsdcmp(void *thunk __unused, const void *a, const void *b)
{
        int ia = *(const int *)a;
        int ib = *(const int *)b;

        if (ia < ib)
                return (-1);
        else if (ia > ib)
                return (1);
        return (0);
}

int
linuxcmp(const void *a, const void *b, void *thunk __unused)
{
        int ia = *(const int *)a;
        int ib = *(const int *)b;

        if (ia < ib)
                return (-1);
        else if (ia > ib)
                return (1);
        return (0);
}

int
main(int argc, char **argv)
{
        int a[] = {3,5,1,2,4};
        int b[] = {3,5,1,2,4};
        size_t i;

        qsort_r(a, nitems(a), sizeof(a[0]), NULL, bsdcmp);
        qsort_r(b, nitems(b), sizeof(b[0]), linuxcmp, NULL);

        printf("BSD qsort_r:  ");
        for (i = 0; i < nitems(a); i++)
                printf(" %d", a[i]);
        printf("\n");
        printf("Linux qsort_r:");
        for (i = 0; i < nitems(b); i++)
                printf(" %d", b[i]);
        printf("\n");

        return (0);
}

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 6756
Build 6971: arc lint + arc unit

Event Timeline

cem retitled this revision from to qsort_r: Add compatability definition for Linux qsort_r.Jan 13 2017, 8:07 PM
cem updated this object.
cem edited the test plan for this revision. (Show Details)
cem added reviewers: ed, jhb, wollman, jilles.
cem updated this revision to Diff 23972.
cem updated this revision to Diff 23973.Jan 13 2017, 8:11 PM

(Add compat routine to Symbol.map)

cem added a comment.Jan 13 2017, 8:33 PM

And no one actually reviews stuff until commit anyway, so if this review is silent I'll just plan to commit it in a few days.

ed edited edge metadata.EditedJan 13 2017, 8:50 PM

Sounds like a good idea to go in this direction, but I'd actually do this even more aggressively: aim at converging towards glibc's prototype entirely. See this comment I actually wrote more than a year ago in the Austin Group's bug tracker:

http://austingroupbugs.net/view.php?id=900#c2779

Maybe you could structure this similarly to how I've added the basename()/dirname() stuff?

  • Call the new symbol qsort_r@FBSD_1.5.
  • Use __generic() to switch between the two symbols. EDIT: Oh, wait. As you mentioned, _Generic() doesn't work with function types.
  • Only define the macro when not in C++ mode, as C++ code doesn't tend to like macros.
cem added a comment.Jan 13 2017, 8:55 PM
In D9169#189473, @ed wrote:

Sounds like a good idea to go in this direction, but I'd actually do this even more aggressively: aim at converging towards glibc's prototype entirely. See this comment I actually wrote more than a year ago in the Austin Group's bug tracker:
http://austingroupbugs.net/view.php?id=900#c2779

I'm about to head out, I'll take a look when I get back, thanks. If that's the direction we decide to go, I'm fine with the qsort_r@FBSD_1.5 bit.

Maybe you could structure this similarly to how I've added the basename()/dirname() stuff?

  • Call the new symbol qsort_r@FBSD_1.5.
  • Use __generic() to switch between the two symbols. EDIT: Oh, wait. As you mentioned, _Generic() doesn't work with function types.

Right.

  • Only define the macro when not in C++ mode, as C++ code doesn't tend to like macros.

I've got "!defined(__cplusplus)" — do I need something else too?

Thanks!

ed added a comment.Jan 13 2017, 9:01 PM
This comment was removed by ed.
ed added a comment.Jan 13 2017, 9:01 PM

Ah, interesting! As you mentioned, it is indeed the case that Clang doesn't like it if you pass in function objects instead of function pointers. This code:

void banana(void) {
}
int main() {
  _Generic(banana, void (*)(void): 1);
}

Results in:

bla.c:4:12: error: controlling expression type 'void (void)' not compatible with any generic

But you can always do some bogus arithmetic on it to force it to be converted to a pointer type:

void banana(void) {
}
int main() {
  _Generic(banana + 0, void (*)(void): 1);
}

So in this case you could still use __generic() if you want.

jilles edited edge metadata.Jan 13 2017, 9:02 PM

This is not a review of the idea to support both qsort_r() variants like this. At this time I can only say it seems somewhat of a dangerous idea.

A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

The addition of __linux_qsort_r to the FBSDprivate_1.0 version is wrong since this symbol will be used by user binaries (applications). Instead, __linux_qsort_r should be added to the newest public version, currently FBSD_1.5. The FBSDprivate_1.0 version is for symbols that are used exclusively within FreeBSD itself (unfortunately, we currently only have reviews enforcing this and that most of the FBSDprivate_1.0 symbols are not declared in /usr/include header files).

"Linux" should be replaced with "glibc" here.

With common ABIs, the core of glibc qsort_r() and qsort() may use the same code, where the user qsort() callback routine ignores the extra parameter. It is not certain whether this should actually be taken advantage of.

I like the idea of handling this the same way as basename()/dirname().

Make sure that things like _Generic(banana + 0, void (*)(void): 1); do not generate warnings about the pointer arithmetic on a function pointer. Alternatively, try _Generic(&banana, void (*)(void):... (which needs additional cases).

The test program should check the thunk values and ideally it should be changed to use ATF and stored into lib/libc/tests/stdlib/ (I recommend adding a new test program and leaving the existing qsort_test.c unchanged).

cem added a comment.Jan 13 2017, 9:59 PM
In D9169#189476, @ed wrote:

But you can always do some bogus arithmetic on it to force it to be converted to a pointer type:

void banana(void) {
}
int main() {
  _Generic(banana + 0, void (*)(void): 1);
}

Huh, it surprises me that that's valid. But I guess it is consistent. I attempted:

_Generic(&banana, void (*)(void): ...)

But that trips on rvalues like NULL (passed as the thunk for the other-style qsort_r).

So in this case you could still use __generic() if you want.

Sure, that seems preferable.

This is not a review of the idea to support both qsort_r() variants like this. At this time I can only say it seems somewhat of a dangerous idea.
A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

Hm, it would surprise me. It should continue to detect at least one variant as working, and using either variant is fine.

The addition of __linux_qsort_r to the FBSDprivate_1.0 version is wrong since this symbol will be used by user binaries (applications). Instead, __linux_qsort_r should be added to the newest public version, currently FBSD_1.5. The FBSDprivate_1.0 version is for symbols that are used exclusively within FreeBSD itself (unfortunately, we currently only have reviews enforcing this and that most of the FBSDprivate_1.0 symbols are not declared in /usr/include header files).

Ok.

"Linux" should be replaced with "glibc" here.

Ok.

With common ABIs, the core of glibc qsort_r() and qsort() may use the same code, where the user qsort() callback routine ignores the extra parameter. It is not certain whether this should actually be taken advantage of.

It seems like a microoptimization which could be pursued a standalone change.

I like the idea of handling this the same way as basename()/dirname().
Make sure that things like _Generic(banana + 0, void (*)(void): 1); do not generate warnings about the pointer arithmetic on a function pointer.

Will do (before changing to __generic()). If it doesn't work, it can stay GCC-style.

Alternatively, try _Generic(&banana, void (*)(void):... (which needs additional cases).

This doesn't work (see above) because not all thunk values can have their address taken.

The test program should check the thunk values

What do you mean?

and ideally it should be changed to use ATF and stored into lib/libc/tests/stdlib/ (I recommend adding a new test program and leaving the existing qsort_test.c unchanged).

Sure, that's fine.

cem added a comment.Jan 13 2017, 10:07 PM

It seems that the + 0 hack isn't feasible due to additional warnings:

test.c:39:2: warning: arithmetic on a pointer to void is a GNU extension [-Wpointer-arith]
        qsort_r(a, nitems(a), sizeof(a[0]), NULL, bsdcmp);
        ^                                   ~~~~
/usr/include/stdlib.h:329:14: note: expanded from macro 'qsort_r'
        __generic(A + 0, __glibc_qsort_cmp_t *, __glibc_qsort_r, qsort_r)( \
                    ^
/usr/include/sys/cdefs.h:337:11: note: expanded from macro '__generic'
        _Generic(expr, t: yes, default: no)
                 ^
test.c:40:2: warning: arithmetic on a pointer to the function type 'int (const void *, const void *, void *)' is a GNU extension [-Wpointer-arith]
        qsort_r(b, nitems(b), sizeof(b[0]), linuxcmp, NULL);
        ^                                   ~~~~~~~~
/usr/include/stdlib.h:329:14: note: expanded from macro 'qsort_r'
        __generic(A + 0, __glibc_qsort_cmp_t *, __glibc_qsort_r, qsort_r)( \
                    ^
/usr/include/sys/cdefs.h:337:11: note: expanded from macro '__generic'
        _Generic(expr, t: yes, default: no)
                 ^
cem edited edge metadata.Jan 13 2017, 10:52 PM
cem updated this revision to Diff 23975.
  • Switch to glibc-style qsort as the new default (@FBSD_1.5)
  • Rename linux_qsort_r to qsort_r and old qsort_r to freebsd11_qsort_r
    • (Refer to it as __old_qsort_r in stdlib.h, like basename/dirname)

TODO: Convert test program to ATF.

cem retitled this revision from qsort_r: Add compatability definition for Linux qsort_r to qsort_r: Standardize on thunk-last API.Jan 14 2017, 12:34 AM
cem updated this object.
cem added inline comments.
lib/libc/stdlib/qsort.c
214

s/dirname/qsort_r/. Will be fixed in the next patch.

In D9169#189490, @cem wrote:

A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

Hm, it would surprise me. It should continue to detect at least one variant as working, and using either variant is fine.

At least notifying portmgr when you do the commit would be nice, so they are not surprised completely.

The test program should check the thunk values

What do you mean?

Pass some unique value as thunk and verify it in the callback function.

However, this does not make your existing tests with NULL thunks redundant, since some implementation of the selection mechanism might not work correctly because 0 is a valid null pointer constant for a function pointer type as well.

cem added a comment.Jan 14 2017, 4:36 PM

At least notifying portmgr when you do the commit would be nice, so they are not surprised completely.

I'll plan to do an exp-run first.

Pass some unique value as thunk and verify it in the callback function.
However, this does not make your existing tests with NULL thunks redundant, since some implementation of the selection mechanism might not work correctly because 0 is a valid null pointer constant for a function pointer type as well.

I'm having trouble imagining why this change would cause the wrong thunk value to be passed. If the order of thunk and callback are swapped, the program will attempt to call data, which should trigger an access violation (or NULL, which will trigger a segfault).

ed added a comment.Jan 15 2017, 7:57 AM
In D9169#189621, @cem wrote:

I'll plan to do an exp-run first.

Would it in that case make sense to also experiment with switching over to the glibc flavour entirely? Considering that this function isn't used that often and most existing users outside of the base system already have auto-detection in place, maybe there is no need to provide any compile-time compatibility.

cem added a comment.Jan 15 2017, 8:06 AM
In D9169#189872, @ed wrote:
In D9169#189621, @cem wrote:

I'll plan to do an exp-run first.

Would it in that case make sense to also experiment with switching over to the glibc flavour entirely? Considering that this function isn't used that often and most existing users outside of the base system already have auto-detection in place, maybe there is no need to provide any compile-time compatibility.

I wouldn't be too surprised if there are some BSD-only ports that that breaks. But I guess we could try it and find out!

cem added a comment.Jan 17 2017, 12:58 AM

Exp-run request for a clean switch to thunk-last here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216157

cem added a comment.Jan 18 2017, 9:25 PM
In D9169#190294, @cem wrote:

Exp-run request for a clean switch to thunk-last here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216157

exp-run results are in.

At least libgit2 doesn't detect qsort_r API, it just hardcodes BSD==thunk-first:

https://github.com/libgit2/libgit2/blob/master/src/util.c#L632

pecl-qb also just checks for BSD and hardcodes from there:

(Warning: HUGE) https://raw.githubusercontent.com/chung-leong/qb/master/qb_interpreter_functions.c

Several ports fail to compile (better than run-time crash) but similarly hardcode:

https://github.com/hselasky/midipp/blob/master/midipp_database.cpp#L344
https://github.com/SWI-Prolog/swipl-devel/blob/master/src/pl-dict.c#L126-L161
https://github.com/hashcat/hashcat/blob/master/include/sort_r.h#L142-L146 (umm)
https://github.com/dajobe/raptor/blob/master/src/sort_r.h#L63-L68 (is this header copy/pasted around?)

So I think we need the compatibility shim.

ed added a comment.EditedJan 19 2017, 3:44 PM
In D9169#190948, @cem wrote:

So I think we need the compatibility shim.

Well, even though the list you've provided is non-empty, it doesn't look too shocking, right? If it's just those <10 ports, we could consider bumping __FreeBSD_version__ and have some local patches there.

EDIT: Sorry. I pressed some keyboard shortcut that made Phabricator submit my draft?

cem added a comment.Jan 19 2017, 3:56 PM
In D9169#191125, @ed wrote:

Well, even though the list you've provided is non-empty, it doesn't look too shocking, right? If it's just those <10 ports, we could consider bumping __FreeBSD_version__ and have some local patches there.

Well, it's those handful, plus any in the 652 skipped ports, plus any number of potential runtime failures:

For the ports that build successfully but disable warnings, I can't think of a way to detect them.

cem added a comment.Feb 10 2017, 5:55 PM

The compatibility definitions don't seem to help ports very much; the same four ports fail to build:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216157#c8

cem abandoned this revision.May 15 2017, 4:53 PM