
physmem: add ram0 pseudo-driver
Accepted, Public

Authored by ehem_freebsd_m5p.com on Oct 6 2021, 8:02 PM.

Details

Reviewers
jhb
gonzo
kevans
andrew
mhorne
imp
Group Reviewers
arm64
Summary

The ram0 pseudo-driver reserves all I/O space belonging to physical memory
from nexus, so that it cannot be handed out to bus_alloc_resource()
callers such as xenpv_alloc_physmem(), which takes the first free range
it can find. This mimics the existing ram pseudo-driver on x86.
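
To illustrate why the reservation matters, here is a minimal userland sketch of a first-fit search over a sorted list of reserved ranges. It is not the kernel's resource manager code, and `struct range` and `first_free()` are hypothetical names used only for this illustration. The idea it shows: with no reservations, a first-fit caller is handed an address that is really RAM; once the RAM ranges are reserved, the same request lands past them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An inclusive address range, in the form printed by devinfo -r. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical first-fit search (illustration only, not rman(9)).
 * "reserved" must be sorted by start address and non-overlapping.
 * On success, stores in *out the lowest address >= floor at which
 * "size" bytes fit without touching any reserved range.
 */
bool
first_free(const struct range *reserved, size_t n, uint64_t floor,
    uint64_t size, uint64_t *out)
{
	uint64_t candidate = floor;

	assert(out != NULL);
	if (size == 0)
		return (false);
	for (size_t i = 0; i < n; i++) {
		/* Range lies entirely below the candidate: ignore it. */
		if (reserved[i].end < candidate)
			continue;
		/* The gap in front of this range is large enough: done. */
		if (reserved[i].start > candidate &&
		    reserved[i].start - candidate >= size)
			break;
		/* Otherwise step past the reserved range. */
		if (reserved[i].end == UINT64_MAX)
			return (false);
		candidate = reserved[i].end + 1;
	}
	/* Reject requests that would run off the end of the address space. */
	if (UINT64_MAX - candidate < size - 1)
		return (false);
	*out = candidate;
	return (true);
}
```

Fed the first two ram0 ranges from the Test Plan output (0x200000-0x80fffff and 0x8100000-0x8112fff), a one-page request starting at 0x200000 is placed at 0x8113000 instead of on top of memory at 0x200000.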

Test Plan

Tested on arm, arm64, riscv.

Relevant output from devinfo -r on my rockpro64:

ram0
      I/O memory addresses:
          0x200000-0x80fffff
          0x8100000-0x8112fff
          0x8113000-0xe8dfffff
          0xe8e00000-0xea317fff
          0xea318000-0xf0f0afff
          0xf0f0b000-0xf0f0efff
          0xf0f0f000-0xf0f0ffff
          0xf0f10000-0xf0f10fff
          0xf0f11000-0xf0f12fff
          0xf0f13000-0xf0f16fff
          0xf0f17000-0xf0f17fff
          0xf0f18000-0xf0f1cfff
          0xf0f1d000-0xf0f1dfff
          0xf0f1e000-0xf0f1efff
          0xf0f1f000-0xf0f20fff
          0xf0f21000-0xf0f21fff
          0xf0f22000-0xf0f22fff
          0xf0f23000-0xf0f24fff
          0xf0f25000-0xf0f25fff
          0xf0f26000-0xf0f26fff
          0xf0f27000-0xf3f3ffff
          0xf3f40000-0xf3f4ffff
          0xf3f50000-0xf5ffffff
          0xf6000000-0xf7fa3fff
          0xf7fa4000-0xf7ffffff

And the matching physmem entries, reported by verbose dmesg:

Physical memory chunk(s):
  0x00200000 - 0x080fffff,   127 MB (  32512 pages)
  0x08113000 - 0xf0f0afff,  3725 MB ( 953848 pages)
  0xf0f0f000 - 0xf0f12fff,     0 MB (      4 pages)
  0xf0f14000 - 0xf0f1bfff,     0 MB (      8 pages)
  0xf0f1d000 - 0xf0f1dfff,     0 MB (      1 pages)
  0xf0f1f000 - 0xf0f20fff,     0 MB (      2 pages)
  0xf0f22000 - 0xf0f22fff,     0 MB (      1 pages)
  0xf0f25000 - 0xf0f25fff,     0 MB (      1 pages)
  0xf0f27000 - 0xf3f3ffff,    48 MB (  12313 pages)
  0xf3f50000 - 0xf5ffffff,    32 MB (   8368 pages)
  0xf7fa4000 - 0xf7ffffff,     0 MB (     92 pages)
Excluded memory regions:
  0x08100000 - 0x08112fff,     0 MB (     19 pages) NoAlloc 
  0xe8e00000 - 0xea317fff,    21 MB (   5400 pages) NoAlloc 
  0xf0f0b000 - 0xf0f0efff,     0 MB (      4 pages) NoAlloc 
  0xf0f10000 - 0xf0f10fff,     0 MB (      1 pages) NoAlloc 
  0xf0f13000 - 0xf0f16fff,     0 MB (      4 pages) NoAlloc 
  0xf0f18000 - 0xf0f1cfff,     0 MB (      5 pages) NoAlloc 
  0xf0f1e000 - 0xf0f1efff,     0 MB (      1 pages) NoAlloc 
  0xf0f21000 - 0xf0f21fff,     0 MB (      1 pages) NoAlloc 
  0xf0f23000 - 0xf0f24fff,     0 MB (      2 pages) NoAlloc 
  0xf0f26000 - 0xf0f26fff,     0 MB (      1 pages) NoAlloc 
  0xf3f40000 - 0xf3f4ffff,     0 MB (     16 pages) NoAlloc 
  0xf6000000 - 0xf7fa3fff,    31 MB (   8100 pages) NoAlloc
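
For anyone cross-checking the dmesg figures by hand, the size columns appear to follow from the inclusive ranges. The sketch below assumes 4 KiB pages and a MB value rounded down to whole mebibytes, which is consistent with every line above; `range_pages()` and `range_mb()` are hypothetical helper names, not functions in subr_physmem.c.

```c
#include <assert.h>
#include <stdint.h>

#define	SKETCH_PAGE_SIZE	4096ULL	/* assumption: 4 KiB base pages */

/* Number of whole pages in the inclusive range [start, end]. */
uint64_t
range_pages(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return ((end - start + 1) / SKETCH_PAGE_SIZE);
}

/* Size of the inclusive range in whole mebibytes, rounded down. */
uint64_t
range_mb(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return ((end - start + 1) / (1024ULL * 1024ULL));
}
```

For example, range_pages(0x00200000, 0x080fffff) gives 32512 and range_mb() for the same range gives 127, matching the first chunk.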

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 47793
Build 44680: arc lint + arc unit

Event Timeline

mhorne requested review of this revision. Oct 6 2021, 8:02 PM

Should have test run results for a Xen DomU within a day. An earlier version, which differed somewhat, handled this job well. The results looked similar to what @mhorne reported.

(for Xen domains this is a crucial missing piece, since otherwise attempting to allocate unused addresses could result in allocating addresses used by physical memory)

Now visible as P519. Works similarly to previous version, handling the crucial job.

mhorne added a subscriber: kevans.

Ping.

Roping in @kevans since he's looked at this file recently. Kyle, since you are planning to switch amd64 to this KPI, should we try to have it use this as well? (rather than the ram pseudo-driver in sys/x86/x86/nexus.c)

During my most recent build I discovered this appears to have been broken, apparently by D34691. I think this can be fixed with an #ifdef _KERNEL, but don't count on me matching FreeBSD style. I haven't yet successfully compiled this, as that will take a while.

sys/kern/subr_physmem.c
529
623

Latest build succeeded and the resultant image appeared to function correctly. So it appears D34691 was the trigger and my solution works, though I won't say it is a good one. Time to rope in more reviewers? (perhaps @andrew might work?)

This is missing functionality for ARM/ARM64. Some FreeBSD drivers expect it to be present and are potentially unstable without it, and x86 already has it. Is there a reviewer in the house?

sys/kern/subr_physmem.c
621–623

It now appears we need to nuke ram_devclass and then remove the ram_devclass argument from DRIVER_MODULE().

ehem_freebsd_m5p.com added a reviewer: mhorne.

Stealing D32343 from @mhorne since this feature is rather important for the project I'm working on.

Two fixups. First, D34691 makes the ram device need to be surrounded by #ifdef _KERNEL. Second, due to @jhb's work removing device classes from DRIVER_MODULE(), this needs the same treatment.

There may be a bit of delta from D34691 changing headers too.

sys/kern/subr_physmem.c
601

How would this interact with a device that's pre-programmed at boot to use certain parts of memory that are in the reserved range? If ram0 attaches first, how will it affect that? Right now, for the code I'm working on, we just use the memory that's in the reserved range and don't bother to allocate it with bus_alloc_resource(). Is that the right way to use it? If so, what happens if ram0 runs first?

sys/kern/subr_physmem.c
601

I suspect that would be a problem. The device needs to have its range in the device-tree or ACPI, and those subsystems would initially reserve the region for you.

Meanwhile I've got the opposite situation. A device which can use pretty well arbitrary regions for mapping, thus needs the nexus to know what address ranges are completely unused so it can allocate a free range. It can use actual memory ranges just fine, but that causes usable memory to disappear. You may need to commandeer this diff to fix the issue.

sys/kern/subr_physmem.c
601

These ranges are definitely not in ACPI/FDT. When Linux kexecs an arm64 kernel, the GIC is already up and running. There's no way to reprogram it, so we have to re-use the memory that Linux used to get it started. This data is passed to us, by way of a kludge, in the UEFI tables (I have code, not yet committed, that adds the reserved ranges here and also checks that the memory we find programmed into this device is in the reserved ranges).

It's possible to add an additional 'really really reserved, don't nobody else use it ever' flag to my calls, since the memory is added to this in one place. That might be another way to deal with my F@^^ up bug workaround...

sys/kern/subr_physmem.c
601

@imp so, let me make sure I understand. These reserved ranges correspond to real physical memory, and the GIC driver will be taught to locate these ranges, map them into KVA, and use them?

If this is the case, I don't think this change will affect your work, and you are using those reserved ranges correctly.

The ram pseudo driver is just to help with the I/O space bookkeeping. "Real" memory is conceptually a totally different type of resource, and is not managed with the bus_*() APIs, but rather physmem, uma, malloc, etc. It just so happens that real memory, as a whole, consumes some of the resource that is SYS_RES_MEMORY, and we need to capture that somewhere.

Put differently, why would any driver call bus_alloc_resource() for real memory? In the typical case of allocating memory through malloc(9), it does not reserve the underlying physical range(s) of I/O space in this way.

OK. After talking it over, I'm convinced there are no ill effects from this, even for my crazy as-yet out-of-tree driver changes.

sys/kern/subr_physmem.c
601

Yes. The memory is marked as reserved by some funky means that's not yet in the tree, so would wind up being allocated by the ram driver.

So in that case, we'd advise driver writers not to call bus_alloc_resource() on the memory. The memory so marked in the Linux tables is memory that can't be used except by the device(s) that are already using it. So it would be up to the device writer to know this, discover this and use this memory. In that case, this driver wouldn't create an issue and it wouldn't interfere with my work, nor would changes be needed to my work. I like that result, and the rule is clear enough to explain. 'malloc' manages DRAM for non-reserved areas, and some architecture / model dependent code knows how to (a) discover the memory and (b) check that it's properly reserved as it expects, using means other than bus_alloc_resource().

This revision is now accepted and ready to land. Oct 12 2022, 8:52 PM