libc: locale: fix some assumptions that wchar_t cannot be signed
Needs Review (Public)

Authored by kevans on Jun 7 2025, 2:45 AM.
Tags
None
Referenced Files
F146504226: D50733.diff
Tue, Mar 3, 5:18 AM

Details

Reviewers
markj
bapt
Summary

wchar_t signedness is platform-dependent, and it's actually signed on
powerpc, riscv and x86. libc assumes in some places that wchar_t values
aren't negative, which could lead to some invalid accesses.

Ensure that comparisons for <= UCHAR_MAX also confirm that the value is
non-negative, and in largesearch() let's cast it to an unsigned value in
case we actually do have a character mapped but it's just unfortunately
out of range for our signed wchar_t. That way we still have a chance of
doing the right thing.
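
To illustrate the class of bug described above, here is a minimal sketch
rather than the actual libc code. It assumes a signed 32-bit character type;
demo_wchar_t, demo_table and the three functions are hypothetical names used
only for this example. With only an upper-bound test, a negative value passes
the check and would index before the start of the table.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a platform where wchar_t is a signed 32-bit int. */
typedef int32_t demo_wchar_t;

/* Hypothetical direct-indexed table with one slot per unsigned char value. */
static const int demo_table[UCHAR_MAX + 1] = { ['a'] = 1 };

/* Upper bound only: a negative value is wrongly reported as in range. */
static int
demo_in_range_unchecked(demo_wchar_t wc)
{
	return (wc <= UCHAR_MAX);
}

/* Both bounds: negative values are rejected before any table access. */
static int
demo_in_range_checked(demo_wchar_t wc)
{
	return (wc >= 0 && wc <= UCHAR_MAX);
}

/* Returns the table entry for wc, or -1 if wc cannot index the table. */
static int
demo_lookup(demo_wchar_t wc)
{
	if (!demo_in_range_checked(wc))
		return (-1);
	assert((size_t)wc < sizeof(demo_table) / sizeof(demo_table[0]));
	return (demo_table[wc]);
}
```

On a platform where wchar_t is unsigned, the extra lower-bound comparison is
always true and costs nothing, so the same check can be written once for both
cases.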

Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 71106
Build 67989: arc lint + arc unit

Event Timeline

kevans requested review of this revision. Jun 7 2025, 2:45 AM
lib/libc/locale/collate.c
303

Why is this cast needed?

lib/libc/locale/collate.c
303

Well, 2026 Kyle doesn't remember. Reasoning through it again, I *think* my logic was that weights are explicitly unsigned in Unicode data, so naturally this should be coerced to unsigned. With fresh eyes, that doesn't make any amount of sense unless wchar_t was 16-bit signed (to avoid sign extension bugs), so I'll drop this.
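
One way to read the sign-extension remark above, as a sketch with
hypothetical helper names and fixed-width types standing in for wchar_t: an
unsigned cast only changes the bit pattern when the source type is narrower
than the destination. A signed 16-bit value widened directly to 32 bits is
sign-extended, whereas a same-width conversion from a signed 32-bit value
keeps the bits as they are.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) > sizeof(int16_t),
    "the first two helpers model a widening conversion");

/* Direct widening of a signed 16-bit value: the sign bit is extended. */
static uint32_t
demo_widen_direct(int16_t v)
{
	return ((uint32_t)v);
}

/* Converting to unsigned at the same width first avoids sign extension. */
static uint32_t
demo_widen_via_unsigned(int16_t v)
{
	return ((uint32_t)(uint16_t)v);
}

/* Same-width conversion: the bit pattern is unchanged. */
static uint32_t
demo_same_width(int32_t v)
{
	return ((uint32_t)v);
}
```

For INT16_MIN the direct widening gives 0xFFFF8000 while the two-step
conversion gives 0x00008000; for INT32_MIN the same-width conversion gives
0x80000000.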