Page MenuHomeFreeBSD

Add retry loop around GetMemoryMap call to fix fragmentation bug
ClosedPublic

Authored by bcran on Feb 25 2019, 6:57 AM.

Details

Summary

The call to BS->AllocatePages can cause the memory map to become framented,
causing BS->GetMemoryMap to return EFI_BUFFER_TOO_SMALL more than once. For
example this can happen on the MinnowBoard Turbot, causing the boot to stop
with an error. Avoid this by calling GetMemoryMap in a loop.

Test Plan

Tested on MinnowBoard Turbot. Caused the existing code to halt boot with the
GetMemoryMap error 5, and re-tested with this patch and verified the boot
continues.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

bcran created this revision.Feb 25 2019, 6:57 AM
bcran added a comment.Feb 25 2019, 7:14 AM

The odd thing is that the comment at the top does say "Note that the AllocatePages call can itself modify the memory map", but then if the second call to GetMemoryMap fails the function bails out instead of retrying. I'm wondering if that was a mistake, and if the call to ExitBootServices can _also_ fail or if we can remove the outer retry loop now the inner once is retrying GetMemoryMap?

The outer loop should be able to go away. IIRC the only non-fatal failure of ExitBootServices at this point is due to not having the latest memory map. If we've already done the allocate -> getmemorymap dance, there will be no failure we can correct.

tsoome added inline comments.Feb 26 2019, 8:36 AM
stand/efi/loader/bootinfo.c
352 ↗(On Diff #54333)

You can make it simpler - this could be endless loop (while (true)), and we:

  1. return if GetMemoryMap() will return error other than EFI_BUFFER_TOO_SMALL
  2. break if GetMemoryMap() had no error
  3. return if AllocatePages() will fail.
354 ↗(On Diff #54333)

If you move if (!EFI_ERROR(status)) check before this line, we can just have if (status != EFI_BUFFER_TOO_SMALL)

363 ↗(On Diff #54333)

Since we did check this already, this if is not needed.

367 ↗(On Diff #54333)

This increment is too much, I think (the memory map can be quite large in systems with a lot of memory).

I would suggest to go with smaller steps, like page at a time, also at next line you can use ROUNDUP(sz, EFI_PAGE_SIZE).

bcran updated this revision to Diff 54449.Feb 27 2019, 6:51 AM

Several changes to improve the code, and apply the same fix to loader/copy.c

bcran added inline comments.Feb 27 2019, 6:53 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

I added a NULL check because BS->FreePages() fails (asserts) when passed a NULL pointer.

bcran added inline comments.Feb 27 2019, 6:55 AM
stand/efi/loader/bootinfo.c
352 ↗(On Diff #54333)

Hmm, it probably _is_ safe to make it an infinite loop since I don't see a way it could end up looping forever.

bcran added inline comments.Feb 27 2019, 6:59 AM
stand/efi/loader/bootinfo.c
363 ↗(On Diff #54333)

Good point - I'll fix it.

367 ↗(On Diff #54333)

For systems with a large amount of memory, I'd expect still a relatively small memory map with one entry that covers most of the RAM. It'll be larger on systems like HPE that have lots of firmware utilities that will cause a lot of fragmentation.

In my latest patch I've changed it to add 10 memory descriptors to the size, which I think should be sufficient.

tsoome added inline comments.Feb 27 2019, 7:04 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

that check should be inside the free(). And this will lead us to another question - boot1 does use pool based allocation, and loader itself does use preallocated heap (see efi_main.c). Where we get to FreePages()?

bcran updated this revision to Diff 54450.Feb 27 2019, 7:07 AM

Changes based on review comments by @tsoome

bcran added inline comments.Feb 27 2019, 7:24 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

Oh, Free _does_ check for a NULL pointer before calling FreePages. However, it does seem that for some reason boot1 isn't using the preallocated heap?

void *
Malloc(size_t len, const char *file __unused, int line __unused)
{
	void *out;

	if (BS->AllocatePool(EfiLoaderData, len, &out) == EFI_SUCCESS)
		return (out);

	return (NULL);
}

void
Free(void *buf, const char *file __unused, int line __unused)
{
	if (buf != NULL)
		(void)BS->FreePool(buf);
}
bcran updated this revision to Diff 54451.Feb 27 2019, 7:26 AM

Remove NULL pointer check before calling free().

bcran added inline comments.Feb 27 2019, 7:31 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

before calling FreePages

Of course that should read FreePool.

tsoome added inline comments.Feb 27 2019, 8:02 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

No, boot1 does not use preallocated heap; We could make it to, but we rather should remove boot1 instead:)

From practical point of view, the size of preallocated heap would be an issue, because boot1 will stay in memory while loader.efi is running, that memory is wasted and will bite the systems with small memory - the loader itself is using 64MB for heap, another 64MB for kernel staging area + the loader binary itself. then add boot1...

imp added inline comments.Feb 27 2019, 2:32 PM
stand/efi/loader/copy.c
119 ↗(On Diff #54451)

You don't need this check.

kevans added inline comments.Feb 27 2019, 2:32 PM
stand/efi/loader/bootinfo.c
391 ↗(On Diff #54451)

mjg (the other mjg, not our mjg) noted that ExitBootServices can modify the memory map then subsequently fail, forcing you to replay the GetMemoryMap dance to actually exit -- but the subsequent retry will succeed, so I was clearly wrong about dropping the outer loop (and now that I re-read the comment, whoops...).

bcran added inline comments.Feb 27 2019, 11:58 PM
stand/efi/loader/bootinfo.c
391 ↗(On Diff #54451)

Do you have a link to the note/discussion, or more information? I haven't read anything in the spec about ExitBootServices modifying the memory map. Is it a bug in some implementations?

bcran added inline comments.Feb 28 2019, 12:14 AM
stand/efi/loader/copy.c
181 ↗(On Diff #54449)

Ah, that's right, loader uses functions in stand/libsa/zalloc_malloc.c. I suspect the tons of calls to the UEFI allocation functions while the loader is running are coming from the system firmware for printing strings etc.

I hope to garbage collect efi/boot1 once I've updated installworld to update the ESP.

bcran marked 7 inline comments as done.Mar 2 2019, 3:46 AM
bcran updated this revision to Diff 54604.Mar 2 2019, 3:53 AM
  • Added a retry loop around ExitBootServices, and addressed other comments on the review.
bcran updated this revision to Diff 54605.Mar 2 2019, 3:55 AM
  • Removed unused ebsretry and retry variables
bcran added inline comments.Mar 2 2019, 3:56 AM
stand/efi/loader/bootinfo.c
367 ↗(On Diff #54333)

@tsoome Could you confirm whether the latest patch addresses your comment, please? I'm not entirely sure what increment you were referring to.

kevans added inline comments.Mar 5 2019, 4:19 AM
stand/efi/loader/bootinfo.c
367 ↗(On Diff #54333)

I think the comment was based on slight misreading; at the time of writing, it was being increased by mmsz (whose name would suggest that it might be the Memory Map SiZe, rather than the descriptor size that it is). Allocating for 10 more descriptors seems reasonable here, I think.

368 ↗(On Diff #54605)

This can probably just go away, as discussed on IRC. The conversion to pages will take care of any necessary alignment, then we'll be clobbering sz here shortly if we don't error out anyways.

bcran updated this revision to Diff 54722.Mar 5 2019, 4:36 AM
  • Remove alignment of sz and rename mmsz to dsz (to match copy.c).
  • Update copy.c to use dsz instead of sizeof(EFI_MEMORY_DESCRIPTOR).
bcran marked an inline comment as done.Mar 5 2019, 4:38 AM
bcran marked 2 inline comments as done.Mar 5 2019, 5:38 AM
imp accepted this revision.Mar 6 2019, 4:22 AM

generally this looks good.
I'm unsure about some of the times you call printf, but it can't hurt, I guess and might help.
minor style issues that I flagged.

stand/efi/loader/bootinfo.c
345 ↗(On Diff #54722)

/*

  • Matthew...

is the style.

357 ↗(On Diff #54722)

can we call printf if we can't get memory?

372 ↗(On Diff #54722)

ditto

394 ↗(On Diff #54722)

I'd be tempted to do this BEFORE the printf call.

This revision is now accepted and ready to land.Mar 6 2019, 4:22 AM
bcran marked 4 inline comments as done.Mar 6 2019, 5:09 AM
bcran added inline comments.
stand/efi/loader/bootinfo.c
345 ↗(On Diff #54722)

Fixed.

357 ↗(On Diff #54722)

Yes, because printf etc. use a preallocated 64MB heap.

372 ↗(On Diff #54722)

We use a block of memory that was already allocated, so no problems calling printf here.

394 ↗(On Diff #54722)

Changed.

tsoome accepted this revision.Mar 6 2019, 5:35 AM
bcran marked 4 inline comments as done.Mar 6 2019, 5:39 AM
This revision was automatically updated to reflect the committed changes.