Page MenuHomeFreeBSD

Add retry loop around GetMemoryMap call to fix fragmentation bug
ClosedPublic

Authored by bcran on Feb 25 2019, 6:57 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Mar 31, 10:39 PM
Unknown Object (File)
Feb 21 2024, 3:18 AM
Unknown Object (File)
Feb 17 2024, 8:47 PM
Unknown Object (File)
Feb 16 2024, 11:45 AM
Unknown Object (File)
Feb 16 2024, 11:45 AM
Unknown Object (File)
Feb 16 2024, 11:45 AM
Unknown Object (File)
Feb 16 2024, 11:45 AM
Unknown Object (File)
Feb 16 2024, 11:45 AM
Subscribers
None

Details

Summary

The call to BS->AllocatePages can cause the memory map to become framented,
causing BS->GetMemoryMap to return EFI_BUFFER_TOO_SMALL more than once. For
example this can happen on the MinnowBoard Turbot, causing the boot to stop
with an error. Avoid this by calling GetMemoryMap in a loop.

Test Plan

Tested on MinnowBoard Turbot. Caused the existing code to halt boot with the
GetMemoryMap error 5, and re-tested with this patch and verified the boot
continues.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

The odd thing is that the comment at the top does say "Note that the AllocatePages call can itself modify the memory map", but then if the second call to GetMemoryMap fails the function bails out instead of retrying. I'm wondering if that was a mistake, and if the call to ExitBootServices can _also_ fail or if we can remove the outer retry loop now the inner once is retrying GetMemoryMap?

The outer loop should be able to go away. IIRC the only non-fatal failure of ExitBootServices at this point is due to not having the latest memory map. If we've already done the allocate -> getmemorymap dance, there will be no failure we can correct.

stand/efi/loader/bootinfo.c
352 ↗(On Diff #54333)

You can make it simpler - this could be endless loop (while (true)), and we:

  1. return if GetMemoryMap() will return error other than EFI_BUFFER_TOO_SMALL
  2. break if GetMemoryMap() had no error
  3. return if AllocatePages() will fail.
354 ↗(On Diff #54333)

If you move if (!EFI_ERROR(status)) check before this line, we can just have if (status != EFI_BUFFER_TOO_SMALL)

363 ↗(On Diff #54333)

Since we did check this already, this if is not needed.

367 ↗(On Diff #54333)

This increment is too much, I think (the memory map can be quite large in systems with a lot of memory).

I would suggest to go with smaller steps, like page at a time, also at next line you can use ROUNDUP(sz, EFI_PAGE_SIZE).

Several changes to improve the code, and apply the same fix to loader/copy.c

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

I added a NULL check because BS->FreePages() fails (asserts) when passed a NULL pointer.

stand/efi/loader/bootinfo.c
352 ↗(On Diff #54333)

Hmm, it probably _is_ safe to make it an infinite loop since I don't see a way it could end up looping forever.

stand/efi/loader/bootinfo.c
363 ↗(On Diff #54333)

Good point - I'll fix it.

367 ↗(On Diff #54333)

For systems with a large amount of memory, I'd expect still a relatively small memory map with one entry that covers most of the RAM. It'll be larger on systems like HPE that have lots of firmware utilities that will cause a lot of fragmentation.

In my latest patch I've changed it to add 10 memory descriptors to the size, which I think should be sufficient.

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

that check should be inside the free(). And this will lead us to another question - boot1 does use pool based allocation, and loader itself does use preallocated heap (see efi_main.c). Where we get to FreePages()?

Changes based on review comments by @tsoome

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

Oh, Free _does_ check for a NULL pointer before calling FreePages. However, it does seem that for some reason boot1 isn't using the preallocated heap?

void *
Malloc(size_t len, const char *file __unused, int line __unused)
{
	void *out;

	if (BS->AllocatePool(EfiLoaderData, len, &out) == EFI_SUCCESS)
		return (out);

	return (NULL);
}

void
Free(void *buf, const char *file __unused, int line __unused)
{
	if (buf != NULL)
		(void)BS->FreePool(buf);
}

Remove NULL pointer check before calling free().

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

before calling FreePages

Of course that should read FreePool.

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

No, boot1 does not use preallocated heap; We could make it to, but we rather should remove boot1 instead:)

From practical point of view, the size of preallocated heap would be an issue, because boot1 will stay in memory while loader.efi is running, that memory is wasted and will bite the systems with small memory - the loader itself is using 64MB for heap, another 64MB for kernel staging area + the loader binary itself. then add boot1...

stand/efi/loader/copy.c
119 ↗(On Diff #54451)

You don't need this check.

stand/efi/loader/bootinfo.c
391 ↗(On Diff #54451)

mjg (the other mjg, not our mjg) noted that ExitBootServices can modify the memory map then subsequently fail, forcing you to replay the GetMemoryMap dance to actually exit -- but the subsequent retry will succeed, so I was clearly wrong about dropping the outer loop (and now that I re-read the comment, whoops...).

stand/efi/loader/bootinfo.c
391 ↗(On Diff #54451)

Do you have a link to the note/discussion, or more information? I haven't read anything in the spec about ExitBootServices modifying the memory map. Is it a bug in some implementations?

stand/efi/loader/copy.c
181 ↗(On Diff #54449)

Ah, that's right, loader uses functions in stand/libsa/zalloc_malloc.c. I suspect the tons of calls to the UEFI allocation functions while the loader is running are coming from the system firmware for printing strings etc.

I hope to garbage collect efi/boot1 once I've updated installworld to update the ESP.

bcran marked 7 inline comments as done.Mar 2 2019, 3:46 AM
  • Added a retry loop around ExitBootServices, and addressed other comments on the review.
  • Removed unused ebsretry and retry variables
stand/efi/loader/bootinfo.c
367 ↗(On Diff #54333)

@tsoome Could you confirm whether the latest patch addresses your comment, please? I'm not entirely sure what increment you were referring to.

stand/efi/loader/bootinfo.c
368 ↗(On Diff #54605)

This can probably just go away, as discussed on IRC. The conversion to pages will take care of any necessary alignment, then we'll be clobbering sz here shortly if we don't error out anyways.

367 ↗(On Diff #54333)

I think the comment was based on slight misreading; at the time of writing, it was being increased by mmsz (whose name would suggest that it might be the Memory Map SiZe, rather than the descriptor size that it is). Allocating for 10 more descriptors seems reasonable here, I think.

  • Remove alignment of sz and rename mmsz to dsz (to match copy.c).
  • Update copy.c to use dsz instead of sizeof(EFI_MEMORY_DESCRIPTOR).
bcran marked an inline comment as done.Mar 5 2019, 4:38 AM
bcran marked 2 inline comments as done.Mar 5 2019, 5:38 AM

generally this looks good.
I'm unsure about some of the times you call printf, but it can't hurt, I guess and might help.
minor style issues that I flagged.

stand/efi/loader/bootinfo.c
345 ↗(On Diff #54722)

/*

  • Matthew...

is the style.

357 ↗(On Diff #54722)

can we call printf if we can't get memory?

372 ↗(On Diff #54722)

ditto

394 ↗(On Diff #54722)

I'd be tempted to do this BEFORE the printf call.

This revision is now accepted and ready to land.Mar 6 2019, 4:22 AM
bcran marked 4 inline comments as done.Mar 6 2019, 5:09 AM
bcran added inline comments.
stand/efi/loader/bootinfo.c
345 ↗(On Diff #54722)

Fixed.

357 ↗(On Diff #54722)

Yes, because printf etc. use a preallocated 64MB heap.

372 ↗(On Diff #54722)

We use a block of memory that was already allocated, so no problems calling printf here.

394 ↗(On Diff #54722)

Changed.

This revision was automatically updated to reflect the committed changes.
bcran marked 4 inline comments as done.