
libc: locale: fix some assumptions that wchar_t cannot be signed
Needs Review, Public

Authored by kevans on Jun 7 2025, 2:45 AM.
Details

Reviewers
markj
bapt
Summary

wchar_t signedness is platform-dependent, and it's actually signed on
powerpc, riscv and x86. We make some assumptions in libc that wchar_t
values aren't negative, which could lead to some invalid accesses.

Ensure that comparisons for <= UCHAR_MAX also confirm that the value is
non-negative. In largesearch(), cast the value to an unsigned type in
case we actually do have a character mapped but it's just unfortunately
out of range for our signed wchar_t; that way we still have a chance of
doing the right thing.
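
The class of bug described above can be sketched roughly as follows. This is
an illustration only, not the code from the diff: int32_t stands in for a
signed wchar_t, and table, in_table_unsafe(), in_table_checked() and lookup()
are hypothetical names invented for the example.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical 256-entry table, standing in for a per-locale fast path. */
static int table[UCHAR_MAX + 1];

/*
 * Upper-bound check only.  A negative value satisfies it, so using the
 * result to guard table[wc] would permit an out-of-bounds index.
 */
static int
in_table_unsafe(int32_t wc)
{
	return (wc <= UCHAR_MAX);
}

/* Also require the value to be non-negative before treating it as an index. */
static int
in_table_checked(int32_t wc)
{
	return (wc >= 0 && wc <= UCHAR_MAX);
}

/* Only index the table once both bounds have been checked. */
static int
lookup(int32_t wc)
{
	if (!in_table_checked(wc))
		return (-1);
	return (table[wc]);
}
```

With wc == -1, in_table_unsafe() reports the value as in range while
in_table_checked() rejects it, so lookup() returns -1 instead of reading
before the start of the table.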

Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 71106
Build 67989: arc lint + arc unit

Event Timeline

kevans requested review of this revision. Jun 7 2025, 2:45 AM
lib/libc/locale/collate.c
303

Why is this cast needed?

lib/libc/locale/collate.c
303

Well, 2026 Kyle doesn't remember. Reasoning through it again, I *think* my logic was that weights are explicitly unsigned in Unicode data, so naturally this should be coerced to unsigned. With fresh eyes, that doesn't make any amount of sense unless wchar_t was 16-bit signed (to avoid sign extension bugs), so I'll drop this.
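
For readers following the sign-extension point, here is a small sketch of why
the width matters. It does not reproduce the code at collate.c line 303;
int16_t and int32_t model a hypothetical 16-bit and a 32-bit signed wchar_t,
and the function names are made up for illustration.

```c
#include <stdint.h>

/*
 * Widening a negative 16-bit signed value straight to 32-bit unsigned
 * sign-extends it: the upper 16 bits are filled with ones.
 */
static uint32_t
widen_direct(int16_t v)
{
	return ((uint32_t)v);
}

/*
 * Going through the unsigned type of the same width first keeps only the
 * low 16 bits, so the widened result is zero-extended instead.
 */
static uint32_t
widen_via_unsigned(int16_t v)
{
	return ((uint32_t)(uint16_t)v);
}

/*
 * With a 32-bit signed source there is no widening step: the conversion
 * to a 32-bit unsigned type yields the same 32-bit pattern either way.
 */
static uint32_t
same_width(int32_t v)
{
	return ((uint32_t)v);
}
```

For INT16_MIN, widen_direct() yields 0xFFFF8000 while widen_via_unsigned()
yields 0x00008000. same_width(-1) is simply 0xFFFFFFFF, which is consistent
with the conclusion that an extra cast buys nothing when the type is already
32 bits wide.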