ofw_bus_node_is_compatible(OF_finddevice("/"), "fsl,lx2160a")
does the trick (assuming that OF_finddevice("/") cannot fail)...
This breaks mmc on the FDT-based Honeycomb with the following log:
I see. But why can't you use something like this (tested only on Honeycomb) https://github.com/strejda/freebsd/commit/40eb737b2e9ca485daef8bf2effa96b053f847f9 ? The advantage of this code is that it can remain unchanged after the thermal zone driver has been introduced.
LGTM
Sorry, I didn't realize we were talking about a different SoC.
For me, the only relevant source on the number of sensors is TRM, not the DT files.
The current code is ready for a variable number of sensors; just fill in a new struct tsensor array according to the TRM and use that instead of default_sensors. Unfortunately, due to a strange habit in the QorIQ DT, we have to use the compatible string of the root node itself instead of the compatible string of the thermal controller :( . See the comment on (original) line 311.
Where do you see the problem?
I'm sorry, but I don't agree with this.
A temperature sensor is a different entity from a thermal zone, and the two must not be mixed. A given SoC always has a fixed number of temperature sensors, but the number of thermal zones is determined by the design of the board/laptop/equipment. In addition, one thermal zone can be controlled by multiple temperature sensors (each of which can have a different weight) and can control multiple cooling devices. A sensor may or may not be a member of a thermal zone, and yet its value may be useful to the user (and thus accessible, for example, by sysctl(8)).
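To make the sensor/zone distinction concrete, here is a minimal sketch. All type and function names are hypothetical (this is not an existing FreeBSD API), and combining the readings as a weighted average is only an assumption about how per-sensor weights might be used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, for illustration only. */
struct zone_sensor {
	int32_t		temp_mc;	/* last reading, millidegrees Celsius */
	uint32_t	weight;		/* contribution of this sensor to the zone */
};

struct thermal_zone {
	const struct zone_sensor *sensors;	/* board-specific selection */
	size_t		nsensors;
};

/*
 * Weighted average of all sensors assigned to a zone (assumed policy).
 * Returns 0 and stores the result, or -1 if the total weight is zero.
 */
int
zone_temperature(const struct thermal_zone *tz, int32_t *temp_mc)
{
	int64_t sum = 0;
	int64_t wsum = 0;

	for (size_t i = 0; i < tz->nsensors; i++) {
		sum += (int64_t)tz->sensors[i].temp_mc * tz->sensors[i].weight;
		wsum += tz->sensors[i].weight;
	}
	if (wsum == 0)
		return (-1);
	*temp_mc = (int32_t)(sum / wsum);
	return (0);
}
```

The sensors themselves stay independent of any zone, so a sensor that belongs to no zone can still be exported on its own.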
I have an initial implementation (almost ready to commit) of a temperature sensor framework and a very early implementation of thermal zones - both necessary for temperature controlled cooling. The problem is that I don't have the free time to finish them right now... So take what you want, or give me time over Christmas, I'll try to finish it to a committable state.
Thanks
On ARM (and on other intrng-enabled systems) the interrupt resource does not represent the real interrupt. It is an arbitrarily assigned index that points to opaque interrupt mapping data collected by ofwbus, so multiple different interrupt resources can identify one actual interrupt, or similar.
I think adding more irregularity to already ill-defined code behavior is not a good idea. Moreover, I think this is an obvious driver error. Requesting a single segment for a buffer that can bounce is clear nonsense; in that case the driver should allocate the buffer using bus_dmamem_alloc() or possibly pass in an aligned buffer. Copying multiple pages back and forth doesn't sound like an optimal solution.
This looks perfect to me, many thanks.
In the final commit, the original arm_kernel_boothdr.awk should be deleted. Otherwise looks good to me.
Otherwise looks good to me.
Tested on MACCHIATObin and HoneyComb LX2.
Please, ignore my previous comment - I overlooked register name :(
imho, only fan53555. And thanks for help.
If you can, do it yourself. My build environment is currently broken. I'm recovering from ZFS metadata corruption, plus ipfw (libalias) wants to divide by zero from time to time... :( Thanks.
We should switch to pmic/fan53555 driver which already supports it (along with other variants). I seem to have forgotten to do that. Can you test this, please?
I need a day or two to test this on my boards, but it looks OK to me.
On correctly implemented systems, reading from an unimplemented device register will cause an exception (external asynchronous interrupt on arm64); on real systems it will return an arbitrary value.
Therefore, pci_dw_detect_atu_unroll() should only be called for modern core versions (which have the DW_IATU_VIEWPORT register implemented) - so we should check the core version first.
Otherwise looks good to me.
For clarification, I meant this:
However, the ELF definition also requires that the address of the undefined weak thread local symbol be translated to NULL. Additionally, I have seen code that relied on this. Unfortunately, because of aarch64's "unique" way of accessing TLS variables, I haven't found a way to implement this.
Wouldn't it be better to narrow down the cases where a message is printed, rather than allowing invalid and unreported behavior?
Nathan confirmed that ofw_bus_gen_get_node() should return -1 for a non-ofw based device. I will commit the fix asap (tomorrow).
I'm working on this, but the situation with the return value is becoming clearer now. Please see the comment in ofw_bus_if.m https://cgit.freebsd.org/src/tree/sys/dev/ofw/ofw_bus_if.m#n148 and Andrew just pointed me to this commit https://cgit.freebsd.org/src/commit/?h=0d8d9edaaaca1
In D30761#692611, @mindal_semihalf.com wrote:
In D30761#692573, @mmel wrote:
First, please don't take this as hating - I just think we've opened a Pandora's box full of mistakes... I've gotten caught up in the 0/-1 ambiguity for invalid phandle more than once, so I think it would be good to get that sorted out.
No worries, I agree that we should use only a single value to indicate a "NULL"/incorrect phandle.
Since it's a uint32_t I'd personally go with "0".
I understand, but 0 is (probably only theoretically) a valid pnode.
In D30761#692212, @mindal_semihalf.com wrote:
In D30761#692182, @mmel wrote:
Imho, this is just papering over the real problem; it is obvious that ofw_bus_lookup_imap() should not be called for a device/bus that is not based on ofw.
ofw_bus_lookup_imap is called regardless of the ofw_pci patch.
pci_dw is a class 1 driver, which inherits devmethods from ofw_pcib. The latter implements pcib_route_interrupt using ofw_pcib_route_interrupt.
In other words ofw_bus_lookup_imap is called because the RC is ofw based.
That's true, but something else is wrong in the pcib_route_interrupt call-down hierarchy. This needs slightly more time and deeper investigation...
I'm still trying to fully understand the real problem, but in the meantime I have a few questions:
- Why does ofw_bus_gen_get_node() return a different value for the error case than ofw_bus_default_get_node()?
Frankly I have no idea. I haven't written any of those. I guess we wanted to differentiate between a case where there is no ofw support on the bus(-1) vs. where no node was found for a given device. (0)
Since phandle_t is a uint32_t type, assigning -1 to it is a bad idea imho.
After looking deeper into the dev/ofw source code, I noticed that the code only compares against -1 for error detection. So I think ofw_bus_gen_get_node() is wrong and should return -1 in case of error.
Well as mentioned above we should definitely return just one error value.
The problem is that changing the return value of ofw_bus_gen_get_node() could potentially break something else that relies on it returning 0.
That's true, but we have time until the next release to fix all potential problems.
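For illustration, a minimal sketch of the 0/-1 ambiguity. The typedef matches the uint32_t width discussed above; the helper name phandle_is_valid() is hypothetical, and rejecting both encodings is only one possible transitional policy, not what the dev/ofw code does.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t phandle_t;	/* same width as the type discussed above */

/*
 * Hypothetical helper: reject both historical "no node" encodings
 * (0 and -1) until a single error value is agreed on.
 */
bool
phandle_is_valid(phandle_t node)
{
	/* -1 converts to 0xffffffff when stored in a uint32_t. */
	return (node != 0 && node != (phandle_t)-1);
}
```

Note that a check like this would also reject node 0, which is (at least theoretically) a valid pnode.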
First, please don't take this as hating - I just think we've opened a Pandora's box full of mistakes... I've gotten caught up in the 0/-1 ambiguity for invalid phandle more than once, so I think it would be good to get that sorted out.
Imho, this is just papering over the real problem; it is obvious that ofw_bus_lookup_imap() should not be called for a device/bus that is not based on ofw.
I'm still trying to fully understand the real problem, but in the meantime I have a few questions:
Yes, this commit is the root cause.
Please see attached log:
IMHO, this is the wrong change. regulator_status() must report the physical state of the regulator, i.e. whether the given device is powered or not. The mmc power sequencer should use a similar technique to backlight. But it seems this kind of usage is common, so we can extend the regulator framework with uncounted functions, something like regulator_on()/regulator_off().
Unfortunately this broke (probably) all existing FDT-enabled boards with enumerated (not FDT-loaded) PCI(e) interfaces (in my case both mcbin and tegra). This driver's probe() matches all PCI buses, and many things (drivers, ofw code) depend on the fact that calling ofw_bus_get_node(dev) on a non-FDT installed device returns zero.
Sorry for the long delay. Can you please also test "(IDMAC_MAX_SIZE * (IDMAC_DESC_SEGS - 1)) / MMC_SECTOR_SIZE"? Minus one should be sufficient, and it passed all my tests.
I don't think this change is necessary (it mitigates another bug). So I prefer to unconditionally initialize both registers in attach.
A write is expected (moreover, these registers are WO and thus not suitable for a SYSCON_MODIFY operation).
I have the initial part of this fixup. It should allow the board to boot, at minimum. Unfortunately, the interrupt handling still has a locking problem. The solution needs a change in the syscon provided by the simple_mfd driver, so I need more time for this (one or two days).
But it would be nice if you could test this in your environment. It should fix the hang-up issue in the pcie driver.
What an ignominy...
Noooo, I'm stupid :( The root of the problem is trivial: gpio_write() doesn't modify only the required bit, but the whole 32-bit register. HW init should use a new gpio_modify() (implemented with SYSCON_MODIFY_4(), not SYSCON_WRITE_4()). Big big sorry for the trouble. I'll fix it tomorrow.
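A minimal sketch of the difference, using a plain variable as a stand-in for the hardware register. The function names reg_write(), reg_modify() and reg_read() are hypothetical; the real driver would go through the syscon interface instead.

```c
#include <stdint.h>

/* Stand-in for a 32-bit hardware register (illustration only). */
static uint32_t fake_reg;

/* Plain write: replaces the whole register, all other bits are lost. */
void
reg_write(uint32_t val)
{
	fake_reg = val;
}

/* Read-modify-write: only the bits named in clear_bits/set_bits change. */
void
reg_modify(uint32_t clear_bits, uint32_t set_bits)
{
	fake_reg = (fake_reg & ~clear_bits) | set_bits;
}

uint32_t
reg_read(void)
{
	return (fake_reg);
}
```

With the register holding 0xF0, reg_modify(0x1, 0x1) leaves 0xF1, while reg_write(0x1) would leave 0x01 and silently clear the upper bits.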
Oops. Thanks for fixing my bugs ...
Thanks, perfect. Please let me know if you need help. I think that there will be more and more similar interrupt controllers/systems, so it's important to create a clean and flexible interface.
I'm sorry but I don't like this approach.
The PIC_MAP_INT function is designed to map external data from an exact source to a given irqsrc. It should not be misused to transfer additional, platform-specific data back to the requester, even less so when this external data (purely synthetic, without any relation to FDT) is passed to the mapping layer as INTR_MAP_DATA_FDT.
Imho, we should convert gicp to a standard MSI-based interrupt controller. The communication between ICU and GICP should be split into a normal MSI request, and the standard method for getting the MSI mapping should be used (msi_map_msi()).
It's a little hard for me to express all the nuances, so I can prepare a skeleton for this solution, if you want.
With above objection.
Can you please try the test again with https://cgit.FreeBSD.org/src/commit/?id=ce5a4083de2d79bc44d209c9e355a09ede47346c ? I hope that it also fixed this problem. Thanks.
My original idea was to do as much as possible for armv7, primarily because it has been working on this platform for a long time. My mistake was that I didn't remember that flush-to-zero mode was chosen because the little version of armv7 VFP may need software emulation for rounding to denormal values, so we chose the IEEE 754 incompatible mode.
Ahh, right, I forgot that Neoverse has (from this point of view) cache levels shifted – see slide 5 of https://www.slideshare.net/linaroorg/getting-the-most-out-of-dynamiq-enabling-support-of-dynamiq-sfo17104
So for the purpose of OS optimization we can treat the real L1 + L2 caches as L1 in the pre-Neoverse sense, the real L3 in the DynamIQ Shared Unit (DSU) block as L2 in the pre-Neoverse sense, and the CMN as L3.
I think we can still handle all the cases using a two-level hierarchy, where NUMA domains are exported as CG_SHARE_L3 groups and clusters as CG_SHARE_L2 groups. It should work on a server system, on big.LITTLE (RK3399), and also on a medium SoC (LX2160A, which has 8 dual-core clusters). Do you think so?
I'm not sure if you want to implement this "extension", and I don't want to block you. The code in this review looks fine to me and doesn't block anything, so push it as needed.
I've spent some time digging through the ARM documentation, but unfortunately we don't seem to be able to determine the exact cache topology. But I think we can estimate it with a reasonable degree of accuracy. For the rest, I assume that bit [24] is set (otherwise the affinity fields are shifted).
I agree with Andrew. We should use mpidr to build the core topology. Nowadays it's easy: mpidr is stored in pcpu for all enumerated cores.
But there is another problem: cpuid should be treated as an arbitrarily chosen value without any connection to the core topology. Nobody can guarantee that cores are numbered sequentially within a NUMA domain. Also, the assumption that NUMA domains are always symmetric (have the same number of cores) looks too optimistic. Ampere, as well as LX2160, uses multiple clusters of dual cores with a per-cluster L2 cache, so I think that L2 cache locality should also be included in the initial implementation.
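As a rough sketch of deriving cluster membership from mpidr instead of cpuid: the affinity field positions and the meaning of bit [24] follow the architectural MPIDR_EL1 layout, but the helper and macro names are hypothetical, and treating the affinity level above the core as "the cluster" is an assumption that holds only for typical implementations.

```c
#include <stdint.h>

/* Illustrative names; bit positions follow the MPIDR_EL1 layout. */
#define	EX_MPIDR_MT		(UINT64_C(1) << 24)
#define	EX_MPIDR_AFF0(x)	((uint32_t)(((x) >> 0) & 0xff))
#define	EX_MPIDR_AFF1(x)	((uint32_t)(((x) >> 8) & 0xff))
#define	EX_MPIDR_AFF2(x)	((uint32_t)(((x) >> 16) & 0xff))

/* Core number within its cluster; shifted up one level when MT is set. */
uint32_t
mpidr_core(uint64_t mpidr)
{
	return ((mpidr & EX_MPIDR_MT) != 0 ?
	    EX_MPIDR_AFF1(mpidr) : EX_MPIDR_AFF0(mpidr));
}

/* Cluster number; cores sharing it are candidates for one L2 group. */
uint32_t
mpidr_cluster(uint64_t mpidr)
{
	return ((mpidr & EX_MPIDR_MT) != 0 ?
	    EX_MPIDR_AFF2(mpidr) : EX_MPIDR_AFF1(mpidr));
}
```

Grouping cores by mpidr_cluster() rather than by consecutive cpuid ranges avoids both the sequential-numbering and the symmetric-domain assumptions.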
By that I mean that I offer help with implementation and testing on FDT based systems (unfortunately, ACPI is out of my scope and also setup).
Kib, this error handling doesn't make sense to me, so I must have missed something important. Can you please explain the context/reason for this strange kind of error hiding? Moreover, EBIG is also allowed in the man page.
The problem is that you cannot expect that (and moreover you cannot determine whether) the kernel can access memory described by an FDT reservation.
An FDT reservation node is used for various purposes. It may be used as an advisory (e.g. for the location of DMA buffers), as shared memory (e.g. for a framebuffer), or as a hard exclusion area (e.g. for memory used in the secure world), thus inaccessible from the kernel. Moreover, in the last case, the implementation is allowed to generate (in rare cases) an imprecise exception as a result of an attempt to access this memory, which is protected by TrustZone hardware.
The FDT reservations are typically added dynamically by U-Boot or ATF, see fdt_add_reserved_memory(). There is some chance that you can use the “no-map” attribute to determine whether the reservation may be accessible by the kernel. For other cases you must be sure that the memory is not mapped by the kernel with different attributes than in U-Boot or firmware. That is architecturally undefined behavior, which sooner or later leads to loss of coherency.
Given all this, you cannot blindly treat reserved memory as accessible by the kernel and exportable by /dev/mem to userspace.
I can only recommend that you try the opposite approach: explicitly map the ACPI tables into the kernel. I assume that you can determine the memory range where these tables are, and also that you can determine the memory attributes if they are mapped by another party (ACPI).
LGTM. thanks.
It is a bit questionable whether ctfconvert should generate an error in this case (I don't think so), but calling it is clearly unnecessary in this case.
I'm afraid that this approach has a problem on FDT-based systems. FDT typically uses reserved memory regions also for the secure portion of base memory (memory used by the secure monitor or PSCI). And reserved memory is handled by using EXFLAG_NOALLOC -> https://cgit.freebsd.org/src/tree/sys/arm64/arm64/machdep.c#n1193
And of course this kind of memory cannot be accessed by /dev/mem.
Unfortunately, a side effect of this is a large reduction of the VA space available for mmap: the available range for (unfixed) mmap is only the interval <start of data segment + MAXDSIZ, end of user VA space>.
Ohh right, my next mistake. Seems I'm not having a lucky day today :(
Thanks for fixing my stupid bugs.
I just committed a fix for arm.
Thanks for the cooperation.
In D27886#623414, @rlibby wrote:
In D27886#623242, @mmel wrote:
Good catch. It never occurred to me that we could have a bug in atomics on two architectures at the same time, so I wasn't looking for a problem in my own garden :) I will try to fix arm by myself.
Are you saying 32-bit ARM is also broken? Those are a little weirder, I think it uses sys/sys/_atomic_subword.h.
Yes, it is also broken. ARM uses its own implementation, which does not do modulo for the bit position.
See https://cgit.freebsd.org/src/tree/sys/arm/include/atomic-v6.h#n862.
I will commit a proper fix in the next hour or so...
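For reference, a portable sketch of the modulo step mentioned above, written with C11 atomics. This is not the FreeBSD implementation; the function name testandset_32() is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically set bit `bit` in a 32-bit word and
 * return its previous value (0 or 1). The bit position is reduced
 * modulo the word width before the mask is built.
 */
int
testandset_32(_Atomic uint32_t *p, unsigned int bit)
{
	uint32_t mask = UINT32_C(1) << (bit % 32);

	return ((atomic_fetch_or(p, mask) & mask) != 0);
}
```

Without the reduction, a bit position of 32 or more would shift by at least the width of the type, which is undefined behavior in C.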
Hmm, right. It would be better to read the manual before coding. My bad.
Good catch. It never occurred to me that we could have a bug in atomics on two architectures at the same time, so I wasn't looking for a problem in my own garden :) I will try to fix arm by myself.
Tested on real HW, everything OK.
This is not entirely true. For every SoC we know where DRAM is located and its maximum size. The same is true for all MMIO peripherals.