Page MenuHomeFreeBSD

Fix 'lock "vm reserv" already initialized' panic.
ClosedPublic

Authored by ray on Aug 15 2019, 1:31 PM.

Details

Summary

test (paddr + VM_LEVEL_0_SIZE <= seg->end) works incorrect in case
when machine is 32bits and memory segment ends at top of address space.
0xfff00000 + 0x100000 <= 0xfff00000 is true, but have to be last one.
Suggested by: kib

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Add Michal, to have one more test.

kib added inline comments.
sys/vm/vm_reserv.c
319 ↗(On Diff #60828)

() around minus are not needed

This revision is now accepted and ready to land.Aug 15 2019, 2:16 PM

Remove extra parentheses.

This revision now requires review to proceed.Aug 15 2019, 2:31 PM
ray marked an inline comment as done.Aug 15 2019, 2:32 PM
ray added inline comments.
sys/vm/vm_reserv.c
319 ↗(On Diff #60828)

done

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

ray marked an inline comment as done.Aug 15 2019, 3:02 PM

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
                                   
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
                                   
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.
So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

In D21272#462558, @ray wrote:
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
                                   
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.
So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

It might be best to use the 'minus' method of checking only when paddr + VM_LEVEL_0_SIZE overflows. I.e. check padd + VM_LEVEL_0_SIZE < paddr, it is true when overflow happens. Then use the 'minus' method, otherwise keep the current arithmetic.

In D21272#462558, @ray wrote:
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
                                   
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.

end does not get truncated.

So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

We should not initialize a reservation for the segment I pasted above, since it does not
span an aligned 2MB range of physical memory. I tested the patch and my amd64 test
system panicked early during boot.

In D21272#462560, @kib wrote:

It might be best to use the 'minus' method of checking only when paddr + VM_LEVEL_0_SIZE overflows. I.e. check padd + VM_LEVEL_0_SIZE < paddr, it is true when overflow happens. Then use the 'minus' method, otherwise keep the current arithmetic.

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

What about that version? :)

Or even (intmax_t), for this code not to make problems 10-20 years later :)

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ? Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

uint64_t cast is no-go of course.

In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ?

No, it is 0xfff00000. But problem with paddr, which is 32bits and paddr + 0x100000 will be 0x00000000.
It satisfy check:
0xffe00000 <= 0xfff00000 - true
0xfff00000 <= 0xfff00000 - true
0x00000000 <= 0xfff00000 - true

BTW, I set MTX_NEW to mtx_init for test, and it just go around.

Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

uint64_t cast is no-go of course.

cast to uint64_t will do nothing on 64bits machines and will have impact on 32bits machines only.
cast to intmax_t will add "double-math" for all, but save code from modifications when 64->128 era begin :)
and since it is affect only boot process, think it acceptable.

IMO, Extra check will be workaround, but possible too. Like:

if (paddr + VM_LEVEL_0_SIZE == 0)
        break;
In D21272#462891, @ray wrote:
In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ?

No, it is 0xfff00000. But problem with paddr, which is 32bits and paddr + 0x100000 will be 0x00000000.
It satisfy check:
0xffe00000 <= 0xfff00000 - true
0xfff00000 <= 0xfff00000 - true
0x00000000 <= 0xfff00000 - true

BTW, I set MTX_NEW to mtx_init for test, and it just go around.

Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

uint64_t cast is no-go of course.

cast to uint64_t will do nothing on 64bits machines and will have impact on 32bits machines only.
cast to intmax_t will add "double-math" for all, but save code from modifications when 64->128 era begin :)
and since it is affect only boot process, think it acceptable.

vM is coded using (somewhat) abstracted types. We do not assume that everything is ILP32 or LP64, at least not in places where it is easy to avoid and which would create one more portability bug for future system porting.

IMO, Extra check will be workaround, but possible too. Like:

if (paddr + VM_LEVEL_0_SIZE == 0)
        break;

Canonical way to check for overflow is

if (paddr + VM_LEVEL_0_SIZE < paddr)

It is also more correct in fact, if paddr is not superpage-aligned.

Check paddr for overflow.

Check paddr for overflow.

In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ? Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

I think having a seg with end 0 would break some invariants. For instance, _vm_phys_create_seg() assumes that segments are sorted by their end address. seg->start and end must be page-aligned, so this means that we cannot include the last page of physical memory in a segment.

uint64_t cast is no-go of course.

A uint64_t cast would also only fix the problem for 32-bit platforms.

This revision is now accepted and ready to land.Aug 16 2019, 2:26 PM

I noted on IRC that roundup2 on line 1058 might result in overflow as well, at least theoretically.

This revision was automatically updated to reflect the committed changes.