Page MenuHomeFreeBSD

qsort_r: Standardize on thunk-last API
AbandonedPublic

Authored by cem on Jan 13 2017, 8:07 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Apr 17, 9:03 PM
Unknown Object (File)
Thu, Apr 11, 1:11 PM
Unknown Object (File)
Mar 15 2024, 1:37 PM
Unknown Object (File)
Mar 15 2024, 1:09 PM
Unknown Object (File)
Jan 10 2024, 11:48 AM
Unknown Object (File)
Dec 20 2023, 2:31 AM
Unknown Object (File)
Nov 23 2023, 9:56 AM
Unknown Object (File)
Nov 23 2023, 2:48 AM
Subscribers
None

Details

Summary

Glibc qsort_r takes the 'thunk' parameter in a different argument, and
passes 'thunk' in a different argument to the comparator. This allows
qsort and qsort_r to share an implementation on common architectures,
where the bogus 3rd argument is mostly free and ignored. Also, support
Linux-originated programs and libraries in a straightforward way.
Backwards compatibility for old programs is provided using the GCC
extension __builtin_choose_expr, which Clang supports.

(It seems like C11 _Generic() should support this; however, it
explicitly does not support 'function' types, and it seems that
'function' and 'function pointer' types are incompatible for its
purposes, at least in Clang. Requiring qsort_r(a,b,c,&cmp,t) seems much
less useful to me. Additionally, a few other hacks were explored and
nothing panned out quite as well as using the GCC extension. See the
phabricator review.)

Test Plan

Inspired by basename/dirname compat routines.

$ cc -std=c11 -Wall -Wextra -g -O2   test.c
$ nm test.o | grep qsort
                 U __linux_qsort_r
                 U qsort_r
$ ./a.out
BSD qsort_r:   1 2 3 4 5
Linux qsort_r: 1 2 3 4 5
#include <sys/param.h>

#include <stdio.h>
#include <stdlib.h>

int
bsdcmp(void *thunk __unused, const void *a, const void *b)
{
        int ia = *(const int *)a;
        int ib = *(const int *)b;

        if (ia < ib)
                return (-1);
        else if (ia > ib)
                return (1);
        return (0);
}

int
linuxcmp(const void *a, const void *b, void *thunk __unused)
{
        int ia = *(const int *)a;
        int ib = *(const int *)b;

        if (ia < ib)
                return (-1);
        else if (ia > ib)
                return (1);
        return (0);
}

int
main(int argc, char **argv)
{
        int a[] = {3,5,1,2,4};
        int b[] = {3,5,1,2,4};
        size_t i;

        qsort_r(a, nitems(a), sizeof(a[0]), NULL, bsdcmp);
        qsort_r(b, nitems(b), sizeof(b[0]), linuxcmp, NULL);

        printf("BSD qsort_r:  ");
        for (i = 0; i < nitems(a); i++)
                printf(" %d", a[i]);
        printf("\n");
        printf("Linux qsort_r:");
        for (i = 0; i < nitems(b); i++)
                printf(" %d", b[i]);
        printf("\n");

        return (0);
}

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 6756
Build 6971: arc lint + arc unit

Event Timeline

cem retitled this revision from to qsort_r: Add compatability definition for Linux qsort_r.
cem updated this object.
cem edited the test plan for this revision. (Show Details)
cem added reviewers: ed, jhb, wollman, jilles.

(Add compat routine to Symbol.map)

And no one actually reviews stuff until commit anyway, so if this review is silent I'll just plan to commit it in a few days.

Sounds like a good idea to go in this direction, but I'd actually do this even more aggressively: aim at converging towards glibc's prototype entirely. See this comment I actually wrote more than a year ago in the Austin Group's bug tracker:

http://austingroupbugs.net/view.php?id=900#c2779

Maybe you could structure this similarly to how I've added the basename()/dirname() stuff?

  • Call the new symbol qsort_r@FBSD_1.5.
  • Use __generic() to switch between the two symbols. EDIT: Oh, wait. As you mentioned, _Generic() doesn't work with function types.
  • Only define the macro when not in C++ mode, as C++ code doesn't tend to like macros.
In D9169#189473, @ed wrote:

Sounds like a good idea to go in this direction, but I'd actually do this even more aggressively: aim at converging towards glibc's prototype entirely. See this comment I actually wrote more than a year ago in the Austin Group's bug tracker:

http://austingroupbugs.net/view.php?id=900#c2779

I'm about to head out, I'll take a look when I get back, thanks. If that's the direction we decide to go, I'm fine with the qsort_r@FBSD_1.5 bit.

Maybe you could structure this similarly to how I've added the basename()/dirname() stuff?

  • Call the new symbol qsort_r@FBSD_1.5.
  • Use __generic() to switch between the two symbols. EDIT: Oh, wait. As you mentioned, _Generic() doesn't work with function types.

Right.

  • Only define the macro when not in C++ mode, as C++ code doesn't tend to like macros.

I've got "!defined(__cplusplus)" — do I need something else too?

Thanks!

This comment was removed by ed.

Ah, interesting! As you mentioned, it is indeed the case that Clang doesn't like it if you pass in function objects instead of function pointers. This code:

void banana(void) {
}
int main() {
  _Generic(banana, void (*)(void): 1);
}

Results in:

bla.c:4:12: error: controlling expression type 'void (void)' not compatible with any generic

But you can always do some bogus arithmetic on it to force it to be converted to a pointer type:

void banana(void) {
}
int main() {
  _Generic(banana + 0, void (*)(void): 1);
}

So in this case you could still use __generic() if you want.

This is not a review of the idea to support both qsort_r() variants like this. At this time I can only say it seems somewhat of a dangerous idea.

A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

The addition of __linux_qsort_r to the FBSDprivate_1.0 version is wrong since this symbol will be used by user binaries (applications). Instead, __linux_qsort_r should be added to the newest public version, currently FBSD_1.5. The FBSDprivate_1.0 version is for symbols that are used exclusively within FreeBSD itself (unfortunately, we currently only have reviews enforcing this and that most of the FBSDprivate_1.0 symbols are not declared in /usr/include header files).

"Linux" should be replaced with "glibc" here.

With common ABIs, the core of glibc qsort_r() and qsort() may use the same code, where the user qsort() callback routine ignores the extra parameter. It is not certain whether this should actually be taken advantage of.

I like the idea of handling this the same way as basename()/dirname().

Make sure that things like _Generic(banana + 0, void (*)(void): 1); do not generate warnings about the pointer arithmetic on a function pointer. Alternatively, try _Generic(&banana, void (*)(void):... (which needs additional cases).

The test program should check the thunk values and ideally it should be changed to use ATF and stored into lib/libc/tests/stdlib/ (I recommend adding a new test program and leaving the existing qsort_test.c unchanged).

In D9169#189476, @ed wrote:

But you can always do some bogus arithmetic on it to force it to be converted to a pointer type:

void banana(void) {
}
int main() {
  _Generic(banana + 0, void (*)(void): 1);
}

Huh, it surprises me that that's valid. But I guess it is consistent. I attempted:

_Generic(&banana, void (*)(void): ...)

But that trips on rvalues like NULL (passed as the thunk for the other-style qsort_r).

So in this case you could still use __generic() if you want.

Sure, that seems preferable.

This is not a review of the idea to support both qsort_r() variants like this. At this time I can only say it seems somewhat of a dangerous idea.

A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

Hm, it would surprise me. It should continue to detect at least one variant as working, and using either variant is fine.

The addition of __linux_qsort_r to the FBSDprivate_1.0 version is wrong since this symbol will be used by user binaries (applications). Instead, __linux_qsort_r should be added to the newest public version, currently FBSD_1.5. The FBSDprivate_1.0 version is for symbols that are used exclusively within FreeBSD itself (unfortunately, we currently only have reviews enforcing this and that most of the FBSDprivate_1.0 symbols are not declared in /usr/include header files).

Ok.

"Linux" should be replaced with "glibc" here.

Ok.

With common ABIs, the core of glibc qsort_r() and qsort() may use the same code, where the user qsort() callback routine ignores the extra parameter. It is not certain whether this should actually be taken advantage of.

It seems like a microoptimization which could be pursued a standalone change.

I like the idea of handling this the same way as basename()/dirname().

Make sure that things like _Generic(banana + 0, void (*)(void): 1); do not generate warnings about the pointer arithmetic on a function pointer.

Will do (before changing to __generic()). If it doesn't work, it can stay GCC-style.

Alternatively, try _Generic(&banana, void (*)(void):... (which needs additional cases).

This doesn't work (see above) because not all thunk values can have their address taken.

The test program should check the thunk values

What do you mean?

and ideally it should be changed to use ATF and stored into lib/libc/tests/stdlib/ (I recommend adding a new test program and leaving the existing qsort_test.c unchanged).

Sure, that's fine.

It seems that the + 0 hack isn't feasible due to additional warnings:

test.c:39:2: warning: arithmetic on a pointer to void is a GNU extension [-Wpointer-arith]
        qsort_r(a, nitems(a), sizeof(a[0]), NULL, bsdcmp);
        ^                                   ~~~~
/usr/include/stdlib.h:329:14: note: expanded from macro 'qsort_r'
        __generic(A + 0, __glibc_qsort_cmp_t *, __glibc_qsort_r, qsort_r)( \
                    ^
/usr/include/sys/cdefs.h:337:11: note: expanded from macro '__generic'
        _Generic(expr, t: yes, default: no)
                 ^
test.c:40:2: warning: arithmetic on a pointer to the function type 'int (const void *, const void *, void *)' is a GNU extension [-Wpointer-arith]
        qsort_r(b, nitems(b), sizeof(b[0]), linuxcmp, NULL);
        ^                                   ~~~~~~~~
/usr/include/stdlib.h:329:14: note: expanded from macro 'qsort_r'
        __generic(A + 0, __glibc_qsort_cmp_t *, __glibc_qsort_r, qsort_r)( \
                    ^
/usr/include/sys/cdefs.h:337:11: note: expanded from macro '__generic'
        _Generic(expr, t: yes, default: no)
                 ^
cem edited edge metadata.
  • Switch to glibc-style qsort as the new default (@FBSD_1.5)
  • Rename linux_qsort_r to qsort_r and old qsort_r to freebsd11_qsort_r
    • (Refer to it as __old_qsort_r in stdlib.h, like basename/dirname)

TODO: Convert test program to ATF.

cem retitled this revision from qsort_r: Add compatability definition for Linux qsort_r to qsort_r: Standardize on thunk-last API.Jan 14 2017, 12:34 AM
cem updated this object.
cem added inline comments.
lib/libc/stdlib/qsort.c
214

s/dirname/qsort_r/. Will be fixed in the next patch.

In D9169#189490, @cem wrote:

A quick query at https://codesearch.debian.net/ suggests there is quite a bit of code that tries to detect the two qsort_r() variants. It would not surprise me if some high-profile port broke.

Hm, it would surprise me. It should continue to detect at least one variant as working, and using either variant is fine.

At least notifying portmgr when you do the commit would be nice, so they are not surprised completely.

The test program should check the thunk values

What do you mean?

Pass some unique value as thunk and verify it in the callback function.

However, this does not make your existing tests with NULL thunks redundant, since some implementation of the selection mechanism might not work correctly because 0 is a valid null pointer constant for a function pointer type as well.

At least notifying portmgr when you do the commit would be nice, so they are not surprised completely.

I'll plan to do an exp-run first.

Pass some unique value as thunk and verify it in the callback function.

However, this does not make your existing tests with NULL thunks redundant, since some implementation of the selection mechanism might not work correctly because 0 is a valid null pointer constant for a function pointer type as well.

I'm having trouble imagining why this change would cause the wrong thunk value to be passed. If the order of thunk and callback are swapped, the program will attempt to call data, which should trigger an access violation (or NULL, which will trigger a segfault).

In D9169#189621, @cem wrote:

I'll plan to do an exp-run first.

Would it in that case make sense to also experiment with switching over to the glibc flavour entirely? Considering that this function isn't used that often and most existing users outside of the base system already have auto-detection in place, maybe there is no need to provide any compile-time compatibility.

In D9169#189872, @ed wrote:
In D9169#189621, @cem wrote:

I'll plan to do an exp-run first.

Would it in that case make sense to also experiment with switching over to the glibc flavour entirely? Considering that this function isn't used that often and most existing users outside of the base system already have auto-detection in place, maybe there is no need to provide any compile-time compatibility.

I wouldn't be too surprised if there are some BSD-only ports that that breaks. But I guess we could try it and find out!

In D9169#190294, @cem wrote:

Exp-run request for a clean switch to thunk-last here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216157

exp-run results are in.

At least libgit2 doesn't detect qsort_r API, it just hardcodes BSD==thunk-first:

https://github.com/libgit2/libgit2/blob/master/src/util.c#L632

pecl-qb also just checks for BSD and hardcodes from there:

(Warning: HUGE) https://raw.githubusercontent.com/chung-leong/qb/master/qb_interpreter_functions.c

Several ports fail to compile (better than run-time crash) but similarly hardcode:

https://github.com/hselasky/midipp/blob/master/midipp_database.cpp#L344
https://github.com/SWI-Prolog/swipl-devel/blob/master/src/pl-dict.c#L126-L161
https://github.com/hashcat/hashcat/blob/master/include/sort_r.h#L142-L146 (umm)
https://github.com/dajobe/raptor/blob/master/src/sort_r.h#L63-L68 (is this header copy/pasted around?)

So I think we need the compatibility shim.

In D9169#190948, @cem wrote:

So I think we need the compatibility shim.

Well, even though the list you've provided is non-empty, it doesn't look too shocking, right? If it's just those <10 ports, we could consider bumping __FreeBSD_version__ and have some local patches there.

EDIT: Sorry. I pressed some keyboard shortcut that made Phabricator submit my draft?

In D9169#191125, @ed wrote:

Well, even though the list you've provided is non-empty, it doesn't look too shocking, right? If it's just those <10 ports, we could consider bumping __FreeBSD_version__ and have some local patches there.

Well, it's those handful, plus any in the 652 skipped ports, plus any number of potential runtime failures:

For the ports that build successfully but disable warnings, I can't think of a way to detect them.

The compatibility definitions don't seem to help ports very much; the same four ports fail to build:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216157#c8