Page MenuHomeFreeBSD

Bhyve: fix SMBIOS Type 17 table generation
ClosedPublic

Authored by bcran on Mar 18 2020, 2:13 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 16, 5:43 PM
Unknown Object (File)
Tue, Nov 12, 6:52 PM
Unknown Object (File)
Sun, Nov 10, 2:20 PM
Unknown Object (File)
Sun, Nov 10, 12:28 AM
Unknown Object (File)
Sat, Nov 9, 8:09 PM
Unknown Object (File)
Sat, Nov 9, 7:36 PM
Unknown Object (File)
Oct 2 2024, 1:35 PM
Unknown Object (File)
Sep 30 2024, 2:54 PM

Details

Summary

Bhyve: fix SMBIOS Type 17 table generation

According to the SMBIOS specification (revision 2.7 or newer), the
extended module size field should only be used for sizes that can't
fit in the older size field.

Test Plan

Tested with FreeBSD: it now detects the correct amount of memory.
Also tested with Linux and Windows (Server 2008, 2012, 2016 and 2019).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Remove unrelated changes.

bcran retitled this revision from Bhyve: log message when rfb client connects to Bhyve: fix SMBIOS Type 17 table generation.Mar 18 2020, 2:14 AM
bcran edited the summary of this revision. (Show Details)
grehan added a subscriber: grehan.

Looks good.

You might want to see what happens with > 64GB (or maybe 128GB, can't recall the value). At some point additional tables are required for Windows.

This revision is now accepted and ready to land.Mar 18 2020, 3:36 AM

You might want to see what happens with > 64GB (or maybe 128GB, can't recall the value). At some point additional tables are required for Windows.

That sounds like something for a different patch, but I'll certainly plan to test that: the biggest machine I have at home has 128GB RAM, but I see https://www.packet.com/ have machines with up to 384GB I can use.

You might want to see what happens with > 64GB (or maybe 128GB, can't recall the value). At some point additional tables are required for Windows.

Ah, it's 32GB that's the limit: Windows Server 2008 R2 doesn't recognize more than that, I'm presuming because it doesn't know about the xsize field that was introduced around 2009 to support larger DIMMs.
I plan to work on a change that splits RAM up into 32GB sized DIMMs to work around that. At some point we might want to introduce an OS or SMBIOS version flag to control features like that.

In looking at the SMBios 2.7 change list in DSP01342.7.1.pdf (final version of 2.7) on page 134 There is a memory device (type 17) update that you also need to implement to allow DIMM's >32GB. In my work on VM_MAXCPU's I found it would actually be desirable at some point to get to 3.X version of some form, can not recall exactly what the minimum was right now, but reading a newer change log I see some interesting things like in 3.0.0 add GUID value for discovering SMBIOS tables in UEFI, which iirc would help our UEFI situation, add new entry for extended BIOS ROM size, oh here it is 3.0.0 Type 4, extned core, core eabled and thread count ranges. Looking at the change list in the 3.3.0 spec between 2.6 and 3.0.0 is not a huge task as much of it does not apply to bhyve.

But I think to bump so 27 you must implement the device type 17 change with this one, which fixes your current issue any way.

usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

We need to be very careful about just bumping this because we added *some* feature from a new version, we must be sure the whole of the SMBios implementation in bhyve is conformat with that version. Luckly the difference between 2.6 and 2.7 is minor, but before this bumps we should look at all the changes to make sure they are implemented.

Argg.. you did at the new field. Hum, I am reading 3.3.0 type 17 now.

rgrimes requested changes to this revision.Mar 20 2020, 7:12 PM

This code should work without a need to split memory up into 32GB Dimms

usr.sbin/bhyve/smbiostbl.c
258 ↗(On Diff #69619)

Comment is miss leading, this is size in kilobytes or megabytes depending on MSB.

708 ↗(On Diff #69619)

This calculation seems odd, I see later it is overwritten for large memory, cant we just continue to use extended all the time and leave this set at 0x7fff? Never mind, spec says not. Ok how about doing:
if (memsize <= (32GB-1MB)) {

  type17->size = size_MB; } else {
 type17->size = 0x7FFF;
 if (guest_himem <= 0 {
  #error should never happen
 } else {
fall into the code below, remove the if, we just did that with an error check

This would make it follow how the spec is written

716 ↗(On Diff #69619)

Should this be >= to, as if it is exactly 0x7fff, aka 32GB-1mb you'll end up with a memory size of 0.

718 ↗(On Diff #69619)

There is no check for truncation here

This revision now requires changes to proceed.Mar 20 2020, 7:12 PM

add GUID value for discovering SMBIOS tables in UEFI, which iirc would help our UEFI situation

bhyve/UEFI has always searched for the SMBIOS table in memory and used that.

jhb added inline comments.
usr.sbin/bhyve/smbiostbl.c
708 ↗(On Diff #69619)

It is not overwritten as we allocate a separate type 17 entry for guest_himem, however, it does seem odd that we set the entry size for lomem to the full size instead of just the lomem size as we did before. That is, it seems we now create two entries for the same logical "DIMM". One holds the truncated size and one holds the extended size if we truncated, otherwise it holds zero, but we only create this second DIMM if guest_himem is > 0? I would expect us to do 1 of two things:

  1. 1 DIMM for both guest_lomem + guest_himem. (This closer to real hardware in practice)
  2. 1 DIMM for guest_lomem and 1 DIMM for guest_himem (what the older code does).
766 ↗(On Diff #69619)

If anything, it seems to be the other way around. We already define 2.7 fields in other types already (e.g. type 3 and types 16+19 for which we hardcode 2.7+ only encodings). I suspect we chose a conservative version number here to appease older guests previously.

This calculation seems odd, I see later it is overwritten for large memory, cant we just continue to use extended all the time and leave this set at 0x7fff? Never mind, spec says not. Ok how about doing:

Sorry, I only just saw the more recent comments. The problem is that older operating systems don't know to look at the extended field, and only ever look at the field which has a maximum value of 32GB.

usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

Ah, okay. That makes sense actually: we can define fields that are only defined by a newer version of the spec because systems which only know about the version we specify won't know to look at the newer fields, and shouldn't get confused.

usr.sbin/bhyve/smbiostbl.c
718 ↗(On Diff #69619)

Thanks, I'll correct those issues.

usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

From reading the spec, it looks like bcdrev should match the major/minor fields. Do you think I should bump bcdrev to 0x26 here, or leave it as it is?

usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

From digging in svn blame it seems in r272007 Peter Grehan bumped smbios->minor without bumping the BCD value. I would suggest we start another review to fix, and eliminate the possibility of this happening again in the future with:

smbios_ep->bcdrev = ((smbios_ep->major & 0xF) << 4) | smbios_ep->minor & 0xF));

I think we should go with 2.7 btw. Systems shouldn't read 2.7 fields if we advertise 2.6 since those fields "shouldn't exist". OTOH, the spec does say systems are permitted to use the length of individual structures to determine which optional fields exist.

This calculation seems odd, I see later it is overwritten for large memory, cant we just continue to use extended all the time and leave this set at 0x7fff? Never mind, spec says not. Ok how about doing:

Sorry, I only just saw the more recent comments. The problem is that older operating systems don't know to look at the extended field, and only ever look at the field which has a maximum value of 32GB.

That is what the 0x7fff value is for when using extended btw, is to report the max possible size for legacy systems. I agree that we should use the non-extended size when it fits.

In re-reading the descriptions of types 17 and 19, I think we probably should only do 1 type 17 entry that is lomem + himem. We already DTRT for having separate type 19 entries for lomem and himem. This is similar to an i386 system with 4GB of RAM in one DIMM. You would have two type 19 entries, 1 for memory from 0 to the memio hole (3.5 GB or so) and one for the rest of RAM starting at 4GB, but you would have one type 17 entry for a 4GB DIMM. That would mean removing the whole 'guest_himem' case from type 17's initializer and using conditional logic as Rod suggested to set size or extended size to the total memory size.

In D24107#532391, @jhb wrote:

I think we should go with 2.7 btw. Systems shouldn't read 2.7 fields if we advertise 2.6 since those fields "shouldn't exist". OTOH, the spec does say systems are permitted to use the length of individual structures to determine which optional fields exist.

The seperate review would just correct the code so that bcdrev is derived from major/minor, then this review can do the bump to 2.7?

Rebecca is it ok with you if I go create a review to do the bcdrev fix?

In D24107#532391, @jhb wrote:

Rebecca is it ok with you if I go create a review to do the bcdrev fix?

Yes, please do.

In D24107#532393, @jhb wrote:

That is what the 0x7fff value is for when using extended btw, is to report the max possible size for legacy systems. I agree that we should use the non-extended size when it fits.

I just checked and @rgrimes is right: when the size field is set to 0x7FFF, dmidecode reports the RAM as "Size: 0 TB" - so we need to use the extended size field for anything equal to or greater than that.
Also, the spec does say this about using the size/extended_size fields:

"For compatibility with older SMBIOS parsers, memory devices smaller than (32 GB - 1 MB) 1399 should be represented using their size in the Size field, leaving the Extended Size field set to 0."

So we _should_ use the non-extended size when it fits.

  • Fixes based on review feedback.

    o Don't create two type 17 entries. o Use the extended size field for 0x7FFF and greater.
  • Revert bumping the SMBIOS version.
usr.sbin/bhyve/smbiostbl.c
718 ↗(On Diff #69619)

Sorry, do you mean it should check if the guest has less than 32MB, and if so store the size with the high bit set?

usr.sbin/bhyve/smbiostbl.c
718 ↗(On Diff #69619)

Well phab has moved this to point at the wrong line, but no what I meant was you do a & 0x7fff or 0x7fffffff on variables that may lead to you truncated a value with no error or warning, which may lead to unexpected results.

Though I believe the &0x7fff is now fixed due to the logic changes, the &0x7ffffff could lead to a truncation and should probably just be a if > 0x7ffffff return error;

  • Rework type17 generation and error out on overflow.
  • Rework type17 generation and error out on overflow.

There is no need to error out on overflow here. The type17 isn't used by most o/s's for determining how much memory is available: there is e820, the EFI memory map, etc etc. Perhaps print an error in bhyve, buit there doesn't seem any point in erroring out.

There is no need to error out on overflow here. The type17 isn't used by most o/s's for determining how much memory is available: there is e820, the EFI memory map, etc etc. Perhaps print an error in bhyve, buit there doesn't seem any point in erroring out.

Okay. Also, I don't think we're going to have guests configured with ~2PB memory any time in the near future :)

  • Change overflow error to a warning and truncate type 17 memory size.

There is no need to error out on overflow here. The type17 isn't used by most o/s's for determining how much memory is available: there is e820, the EFI memory map, etc etc. Perhaps print an error in bhyve, buit there doesn't seem any point in erroring out.

Okay. Also, I don't think we're going to have guests configured with ~2PB memory any time in the near future :)

Indeed :) And when 5-level paging h/w is available there will be a lot more changes needed than just htis.

Misconfigurations of guest memory that go past the supported host address lines will be caught earlier than this, and in a review I should be posting shortly.

  • Rework type17 generation and error out on overflow.

There is no need to error out on overflow here. The type17 isn't used by most o/s's for determining how much memory is available: there is e820, the EFI memory map, etc etc. Perhaps print an error in bhyve, buit there doesn't seem any point in erroring out.

I disagree, if infact for some reason that value has gotten >2PB something has gone wrong and what your going to end up with is certainly not what the user wanted.

usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

Why are these 2 lines showing up as what look like reverse diffs now?

bcran added inline comments.
usr.sbin/bhyve/smbiostbl.c
766 ↗(On Diff #69619)

Oops, I gave the wrong revision to "arc patch" - "HEAD~3" instead of "HEAD~5" so it didn't have the full patch. Should be fixed now, sorry.

jhb added inline comments.
usr.sbin/bhyve/smbiostbl.c
710 ↗(On Diff #69979)

Please stick to C style comments to be consistent with the rest of the code.

712 ↗(On Diff #69979)

I concur with only a warning and not error'ing here. DMI/SMBIOS tables are generally only used for cosmetics (FreeBSD uses it to be more tidy for dmesg, but all the "real" stuff uses SMAP / EFI memory map).

@rgrimes Could you approve the latest patch, or provide comments on what should be changed please?

  • Change comments to C style.
bcran added inline comments.
usr.sbin/bhyve/smbiostbl.c
710 ↗(On Diff #69979)

Good point. Fixed.

bcran edited the summary of this revision. (Show Details)

I am happy with the other fixes, not sure if there is a need to fix or error check memory size bit overflows

usr.sbin/bhyve/smbiostbl.c
699 ↗(On Diff #70019)

I just noticed this, its possible for these types to be overflowed as uint64_t/1024 can be 54 bits long and uint64_t/MB can be 44 bits.

This revision is now accepted and ready to land.Mar 30 2020, 10:05 PM
bcran added inline comments.
usr.sbin/bhyve/smbiostbl.c
699 ↗(On Diff #70019)

I'll change them back to uint64_t before committing.

This revision was automatically updated to reflect the committed changes.
bcran marked an inline comment as done.