Page MenuHomeFreeBSD

Fix 'lock "vm reserv" already initialized' panic.
ClosedPublic

Authored by ray on Aug 15 2019, 1:31 PM.

Details

Summary

test (paddr + VM_LEVEL_0_SIZE <= seg->end) works incorrect in case
when machine is 32bits and memory segment ends at top of address space.
0xfff00000 + 0x100000 <= 0xfff00000 is true, but have to be last one.
Suggested by: kib

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ray created this revision.Aug 15 2019, 1:31 PM
ray added a subscriber: mmel.Aug 15 2019, 1:39 PM

Add Michal, to have one more test.

kib accepted this revision.Aug 15 2019, 2:16 PM
kib added inline comments.
sys/vm/vm_reserv.c
319 ↗(On Diff #60828)

() around minus are not needed

This revision is now accepted and ready to land.Aug 15 2019, 2:16 PM
kib added reviewers: alc, dougm.Aug 15 2019, 2:19 PM
ray updated this revision to Diff 60831.Aug 15 2019, 2:31 PM

Remove extra parentheses.

This revision now requires review to proceed.Aug 15 2019, 2:31 PM
ray marked an inline comment as done.Aug 15 2019, 2:32 PM
ray added inline comments.
sys/vm/vm_reserv.c
319 ↗(On Diff #60828)

done

markj added a comment.Aug 15 2019, 2:53 PM

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

ray marked an inline comment as done.Aug 15 2019, 3:02 PM

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

markj added a comment.Aug 15 2019, 3:08 PM
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
                                   
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0
ray added a comment.Aug 15 2019, 3:21 PM
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.
So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

kib added a comment.Aug 15 2019, 3:30 PM
In D21272#462558, @ray wrote:
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.
So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

It might be best to use the 'minus' method of checking only when paddr + VM_LEVEL_0_SIZE overflows. I.e. check padd + VM_LEVEL_0_SIZE < paddr, it is true when overflow happens. Then use the 'minus' method, otherwise keep the current arithmetic.

markj added a comment.Aug 15 2019, 5:00 PM
In D21272#462558, @ray wrote:
In D21272#462545, @ray wrote:

Don't we then have the same problem if seg->end < VM_LEVEL_0_SIZE?

only if you want to add some really small seg at 0x00000000 :)

Why does it have to be at 0x00000000? With the segment below we will test 0x100000 <= 0x9d000 - 0x100000, and the variables are unsigned.

SEGMENT 0:  
start:     0x10000                 
end:       0x9d000
domain:    0                       
free list: 0xffffffff81233cb0

no, start and end will be truncated and rounded up.

end does not get truncated.

So you will get segment 0x00000000 with size 0x00100000.
First iteration will look that:
0x000000 <= 0x100000 - 0x100000, which is true and you will initialize one super page.

We should not initialize a reservation for the segment I pasted above, since it does not
span an aligned 2MB range of physical memory. I tested the patch and my amd64 test
system panicked early during boot.

In D21272#462560, @kib wrote:

It might be best to use the 'minus' method of checking only when paddr + VM_LEVEL_0_SIZE overflows. I.e. check padd + VM_LEVEL_0_SIZE < paddr, it is true when overflow happens. Then use the 'minus' method, otherwise keep the current arithmetic.

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

ray updated this revision to Diff 60850.Aug 15 2019, 6:50 PM

Simplify patch.

ray added a comment.Aug 15 2019, 6:50 PM

What about that version? :)

ray added a comment.Aug 15 2019, 7:13 PM

Or even (intmax_t), for this code not to make problems 10-20 years later :)

kib added a comment.Aug 16 2019, 9:14 AM

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ? Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

uint64_t cast is no-go of course.

ray added a comment.EditedAug 16 2019, 10:11 AM
In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ?

No, it is 0xfff00000. But problem with paddr, which is 32bits and paddr + 0x100000 will be 0x00000000.
It satisfy check:
0xffe00000 <= 0xfff00000 - true
0xfff00000 <= 0xfff00000 - true
0x00000000 <= 0xfff00000 - true

BTW, I set MTX_NEW to mtx_init for test, and it just go around.

Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.
uint64_t cast is no-go of course.

cast to uint64_t will do nothing on 64bits machines and will have impact on 32bits machines only.
cast to intmax_t will add "double-math" for all, but save code from modifications when 64->128 era begin :)
and since it is affect only boot process, think it acceptable.

IMO, Extra check will be workaround, but possible too. Like:

if (paddr + VM_LEVEL_0_SIZE == 0)
        break;
kib added a comment.Aug 16 2019, 10:33 AM
In D21272#462891, @ray wrote:
In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ?

No, it is 0xfff00000. But problem with paddr, which is 32bits and paddr + 0x100000 will be 0x00000000.
It satisfy check:
0xffe00000 <= 0xfff00000 - true
0xfff00000 <= 0xfff00000 - true
0x00000000 <= 0xfff00000 - true
BTW, I set MTX_NEW to mtx_init for test, and it just go around.

Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.
uint64_t cast is no-go of course.

cast to uint64_t will do nothing on 64bits machines and will have impact on 32bits machines only.
cast to intmax_t will add "double-math" for all, but save code from modifications when 64->128 era begin :)
and since it is affect only boot process, think it acceptable.

vM is coded using (somewhat) abstracted types. We do not assume that everything is ILP32 or LP64, at least not in places where it is easy to avoid and which would create one more portability bug for future system porting.

IMO, Extra check will be workaround, but possible too. Like:

if (paddr + VM_LEVEL_0_SIZE == 0)
        break;

Canonical way to check for overflow is

if (paddr + VM_LEVEL_0_SIZE < paddr)

It is also more correct in fact, if paddr is not superpage-aligned.

ray updated this revision to Diff 60882.Aug 16 2019, 11:24 AM

Check paddr for overflow.

ray updated this revision to Diff 60883.Aug 16 2019, 11:38 AM

Check paddr for overflow.

markj accepted this revision.Aug 16 2019, 2:26 PM
In D21272#462874, @kib wrote:

Can't we simply test paddr < rounddown2(seg->end, VM_LEVEL_0_SIZE)?

Is seg->end 0xffffffff or 0 for the problematic case ? Anyway, I prefer the explicit check for overflow, it makes the things absolutely clear for reader.

I think having a seg with end 0 would break some invariants. For instance, _vm_phys_create_seg() assumes that segments are sorted by their end address. seg->start and end must be page-aligned, so this means that we cannot include the last page of physical memory in a segment.

uint64_t cast is no-go of course.

A uint64_t cast would also only fix the problem for 32-bit platforms.

This revision is now accepted and ready to land.Aug 16 2019, 2:26 PM
kib added a comment.Aug 16 2019, 2:30 PM

I noted on IRC that roundup2 on line 1058 might result in overflow as well, at least theoretically.

alc accepted this revision.Aug 16 2019, 3:27 PM
kib accepted this revision.Aug 16 2019, 6:48 PM
This revision was automatically updated to reflect the committed changes.