Page MenuHomeFreeBSD

[mips] fix up the assembly generation of unaligned exception loads
ClosedPublic

Authored by adrian on May 28 2020, 8:01 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Nov 12, 10:03 PM
Unknown Object (File)
Sun, Nov 10, 7:29 AM
Unknown Object (File)
Sep 18 2024, 9:37 AM
Unknown Object (File)
Sep 12 2024, 2:58 PM
Unknown Object (File)
Sep 7 2024, 3:12 PM
Unknown Object (File)
Aug 22 2024, 4:19 PM
Unknown Object (File)
Aug 20 2024, 4:46 AM
Unknown Object (File)
Jul 31 2024, 6:55 AM
Subscribers

Details

Summary

I noticed that unaligned accesses were returning garbage values.

Give test data like this:

char testdata[] = { 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf1, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef, 0x5a };

Iterating through uint32_t space 1 byte at a time should
look like this:

freebsd-carambola2:/mnt# ./test
Hello, world!
offset 0 pointer 0x410b00 value 0x12345678 0x12345678
offset 1 pointer 0x410b01 value 0x3456789a 0x3456789a
offset 2 pointer 0x410b02 value 0x56789abc 0x56789abc
offset 3 pointer 0x410b03 value 0x789abcde 0x789abcde
offset 4 pointer 0x410b04 value 0x9abcdef1 0x9abcdef1
offset 5 pointer 0x410b05 value 0xbcdef123 0xbcdef123
offset 6 pointer 0x410b06 value 0xdef12345 0xdef12345
offset 7 pointer 0x410b07 value 0xf1234567 0xf1234567

.. but to begin with it looked like this:

offset 0 value 0x12345678
offset 1 value 0x00410a9a
offset 2 value 0x00419abc
offset 3 value 0x009abcde
offset 4 value 0x9abcdef1
offset 5 value 0x00410a23
offset 6 value 0x00412345
offset 7 value 0x00234567

The amusing reason? The compiler is generating the lwr/lwl incorrectly.
Here's an example after I tried to replace the two macros with a single
invocation and offset, rather than having the compiler compile in addiu
to s3 - but the bug is the same:

1044: 8a620003 lwl v0,0(s3)
1048: 9a730000 lwr s3,3(s3)

.. which is just totally trashy and wrong.

This explicitly tells the compiler to treat the output as being read
and written to, which is what lwl/lwr does with the destination
register.

I think a subsequent commit should unify these macros to skip an addiu,
but that can be a later commit.

Test Plan
  • tested on mips24k (ar9331), gcc-6 (but yes, happens on later gccs too)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

It strikes me that in some ways what you really want is to have the pair of instructions as a single function instead. Something like:

static __inline int32_t
lw_unaligned(void *addr)
{
    uint32_t value;

    __asm __volatile("lwl %0, %1\n"
                     "lwr %0, %2\n" : "=r" (value)
                     : "m" (addr), "m" ((char *)addr + 3));
    return (value);
}

For OP_LWU you just make sure you cast the value to uint32_t before assigning it to the target register to zero-extend. The current patch (modulo the change to lb_macro) seems fine to me though as more of a quick fix / band-aid.

Of course, this entire function is utterly broken (a bit par for the course quality-wise for sys/mips unfortunately). You can't do user accesses as bare loads and stores since they could fault. useracc() doesn't really prevent that at all and is a useless test and racey. What you really should do is write more copyin/copyout type functions that are wrappers around the two instruction sequence above that you can call like fueword_unaligned and 'suword_unaligned`.

sys/mips/mips/trap.c
112 ↗(On Diff #72374)

lb should probably stay as '=r' as it only writes to the output register?

In D25040#551533, @jhb wrote:

It strikes me that in some ways what you really want is to have the pair of instructions as a single function instead. Something like:

static __inline int32_t
lw_unaligned(void *addr)
{
    uint32_t value;

    __asm __volatile("lwl %0, %1\n"
                     "lwr %0, %2\n" : "=r" (value)
                     : "m" (addr), "m" ((char *)addr + 3));
    return (value);
}

For OP_LWU you just make sure you cast the value to uint32_t before assigning it to the target register to zero-extend. The current patch (modulo the change to lb_macro) seems fine to me though as more of a quick fix / band-aid.

It's definitely a bandaid. Right now I want to at least get it working. I've done a local hack to convert ONLY the unaligned LW path to back-to-back assembly instructions inside a single macro and that works as well.

Of course, this entire function is utterly broken (a bit par for the course quality-wise for sys/mips unfortunately). You can't do user accesses as bare loads and stores since they could fault. useracc() doesn't really prevent that at all and is a useless test and racey. What you really should do is write more copyin/copyout type functions that are wrappers around the two instruction sequence above that you can call like fueword_unaligned and 'suword_unaligned`.

I actually first did it with copyin to verify that it wasn't something odd elsewhere - at which point you don't need to use the macros as you're effectively doing a byte by byte copy anyway, no? (does copyin/copyout enforce alignment? I hope not..)

sys/mips/mips/trap.c
112 ↗(On Diff #72374)

I think that's technically correct, but I'm trying to be consistent (and I failed above in lbu)

As long as the compiler doesn't treat the result as something it can toss and thus reassign the input register then it's okay.

Honestly, I'd rather just land this and fix lbu_macro() to be +r too, and then replace it all with copyin/copyout to be "technically correct" in all possible userland address arrangement cases. If I want to tackle unaligned load/stores from /kernel/ memory then we can eventually resurrect all of this and do it right.

This revision was not accepted when it landed; it landed in state Needs Review.May 29 2020, 12:05 AM
This revision was automatically updated to reflect the committed changes.