Page MenuHomeFreeBSD

Add array bound checking mechanism for _ctype.h
AbandonedPublic

Authored by aokblast on Oct 5 2023, 11:02 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, May 9, 7:27 AM
Unknown Object (File)
Jan 5 2024, 10:32 PM
Unknown Object (File)
Dec 27 2023, 2:19 AM
Unknown Object (File)
Dec 25 2023, 5:20 PM
Unknown Object (File)
Dec 20 2023, 8:03 AM
Unknown Object (File)
Dec 8 2023, 12:06 AM
Unknown Object (File)
Dec 2 2023, 6:32 PM
Unknown Object (File)
Oct 10 2023, 4:58 PM
Subscribers

Details

Reviewers
lwhsu
emaste
markj
Summary

This patch is submitted when I was looking at the llvm fsanitize bug on FreeBSD.
At that moment, I was checking if there are some error in libc that may cause this error and discover that some _ctype function are called as macro and thus is without bound_checking.

Take the following function as an exmaple:

#define isprint(c)      __sbistype((c), _CTYPE_R)

And it will finally call

static __inline int
__maskrune(__ct_rune_t _c, unsigned long _f)
{
        return ((_c < 0 || _c >= _CACHED_RUNES) ? ___runetype(_c) :
                _CurrentRuneLocale->__runetype[_c]) & _f;
}

In this case, The _ct_rune_t type is integer type and the length of __runetype is only 256.
Thus I make a bitmask as a boundary checking on the function which may have this issue.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 53847
Build 50738: arc lint + arc unit

Event Timeline

aokblast added a reviewer: lwhsu.
aokblast edited the summary of this revision. (Show Details)

I don't understand this change or your example. The check (_c < 0 || _c >= _CACHED_RUNES) ensures that c is between 0 and 256, so the patch shouldn't change anything. What am I missing?

Looking at the LLVM bug report, I suspect that the access of _CurrentRuneLocale->__runetype is what's triggering the report. The __runtype array is initialized by libc, but if libc.so wasn't compiled with memory-sanitizer enabled, how will the MSan runtime know about the initialization?

Sorry about my careless, I copy the wrong code segment.

The isprint will finally call this function

static __inline int
__sbmaskrune(__ct_rune_t _c, unsigned long _f)
{
	return (_c < 0 || _c >= __mb_sb_limit) ? 0 :
	       _CurrentRuneLocale->__runetype[_c] & _f;
}

Which mb_sb_limt has no direct association with _CACHED_RUNES (size of runetype) in here.
Of course, this patch is not aim at solving the github issue.
It just a extra discovery that I think it may cause a bug in some condition.

Which mb_sb_limt has no direct association with _CACHED_RUNES (size of runetype) in here.
Of course, this patch is not aim at solving the github issue.
It just a extra discovery that I think it may cause a bug in some condition.

In libc we have int __mb_sb_limit = 256; /* Expected to be <= _CACHED_RUNES */.

Oh, I found the comment in that file. Sorry about wasting your time. I close the revision right now and on the other hand checking the code on llvm-project that may cause the issue. Thanks you.

Oh, I found the comment in that file. Sorry about wasting your time. I close the revision right now and on the other hand checking the code on llvm-project that may cause the issue. Thanks you.

I suspect that the real problem is that llvm's libcompiler-rt fails to intercept calls to isprint() on FreeBSD. If I compile the example and disassemble main, I see:

0x00000000000996ed <+93>:    xor    %rcx,%rax
0x00000000000996f0 <+96>:    movabs $0x100000000000,%rcx
0x00000000000996fa <+106>:   add    %rcx,%rax
0x00000000000996fd <+109>:   movl   $0x0,(%rax)
0x0000000000099703 <+115>:   movl   $0x0,-0x4(%rbp)
0x000000000009970a <+122>:   xor    %edi,%edi
0x000000000009970c <+124>:   lea    -0x83e2a(%rip),%rsi        # 0x158e9
0x0000000000099713 <+131>:   call   0x4eb90 <__interceptor_setlocale(int, char*)>
0x0000000000099718 <+136>:   mov    $0x7f,%edi
0x000000000009971d <+141>:   mov    $0x40000,%esi
0x0000000000099722 <+146>:   call   0x99730 <__sbistype>
0x0000000000099727 <+151>:   add    $0x10,%rsp
0x000000000009972b <+155>:   pop    %rbp
0x000000000009972c <+156>:   ret

so the setlocale() call is getting intercepted, but isprint() is not.

Yes, isprint is a macro in FreeBSD's libc, so it will disappear after c pre-processor.
But I don't think this is the reason why it doesn't behavior correctly. However, I will still try to build libc with sanitizer and try again and I am still checking the codes in contrib/llvm-project/compiler-rt/msan at the same time.