Page MenuHomeFreeBSD

amdtemp driver update
Needs ReviewPublic

Authored by rozhuk.im-gmail.com on Feb 22 2017, 11:59 PM.

Details

Summary

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194792

Completely rewritten amdtemp driver.
There is no longer binding to the processor ID, all the new processors will be supported without the need to modify the code.

Maybe for older CPUs (FAMILY 0x0f) that it would not work, I have not had a chance to test.

sensor_offset - can be customized for each sensor separately.
In fact there is only one sensor to the processor.

Thermtrip Status - all registers are read-only.
Hardware Thermal Control (HTC) - everything except HtcTmpLmt available for recording.
Reported Temperature Control - everything except CurTmp available for recording.
SB-TSI - most writable.
Register values ​​can be found in the processor documentation.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
mmokhi added a subscriber: mmokhi.Feb 23 2017, 9:46 PM

At first, Thanks for your great work __/|\__
Technically I'm not right person to talk deeply about this.
But for helping to reduce amount of works for next reviewer (and also for you ;D),
Some style(9) points:
use the form:

type
name()
{
}

instead of

type
name() {
}
/usr/src/sys/dev/amdtemp/amdtemp.c
38 ↗(On Diff #25606)

Where is it used?
I think maybe, maybe you can use KASSERT?

allanjude added inline comments.
/usr/src/sys/dev/amdtemp/amdtemp.c
4 ↗(On Diff #25606)

this change looks like it is inadvertent

Style corrections.

rpaulo edited edge metadata.Feb 25 2017, 2:45 AM

This needs a lot of style fixes. Please read the style man page and check other drivers for examples.

sys/dev/amdtemp/amdtemp.c
38

No can do.

99

{ should be in the line above.

788

I understand why people like the inverted comparison, but it's not the current FreeBSD style.

I will fix style a bi later.

/usr/src/sys/dev/amdtemp/amdtemp.c
4 ↗(On Diff #25606)

My code is old. Then I start there was 2009-2011.

38 ↗(On Diff #25606)

This is for disable sysctl static type checks.

sys/dev/amdtemp/amdtemp.c
38

Without this build fail.
I dont know how to fix.
Have you any ideas?

rpaulo added inline comments.Feb 25 2017, 5:59 AM
sys/dev/amdtemp/amdtemp.c
38

What's the build error?

sys/dev/amdtemp/amdtemp.c
38

/usr/src/sys/modules/amdtemp/../../dev/amdtemp/amdtemp.c:1077:7: error: static_assert expression is not an integral constant expression

regs[i].flags, sc,
^~~~~~~~~~~~~~~~~~

/usr/src/sys/sys/sysctl.h:733:13: note: expanded from macro 'SYSCTL_ADD_PROC'

CTASSERT(((access) & CTLTYPE) != 0);                            \
~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~

/usr/src/sys/sys/systm.h:103:36: note: expanded from macro 'CTASSERT'
#define CTASSERT(x) _Static_assert(x, "compile-time assertion failed")

^

/usr/src/sys/modules/amdtemp/../../dev/amdtemp/amdtemp.c:1093:7: error: static_assert expression is not an integral constant expression

regs[i].flags, sc,
^~~~~~~~~~~~~~~~~~

/usr/src/sys/sys/sysctl.h:733:13: note: expanded from macro 'SYSCTL_ADD_PROC'

CTASSERT(((access) & CTLTYPE) != 0);                            \
~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~

/usr/src/sys/sys/systm.h:103:36: note: expanded from macro 'CTASSERT'
#define CTASSERT(x) _Static_assert(x, "compile-time assertion failed")

^

2 errors generated.

  • amdtemp.o ---
  • [amdtemp.o] Error code 1
rpaulo added inline comments.Feb 25 2017, 8:24 PM
sys/dev/amdtemp/amdtemp.c
38

Change the code to pass non-variable access controls to sysctl_add_proc.
Defeating CTASSERT() is not an option.

rozhuk.im-gmail.com edited edge metadata.EditedMar 4 2017, 1:57 AM
rozhuk.im-gmail.com updated this revision to Diff 25963.

Style update.
CTASSERT() removed.

rpaulo added a comment.Mar 4 2017, 3:26 AM

How can we test this on all hardware versions? I don't have AMD hardware. You could ask on the mailing list.

sys/dev/amdtemp/amdtemp.c
61

{ in line above.

98

{ in line above

102

{ in line above. Please fix all instances.

116

No reason for this to be packed, right?
Use a CTASSERT(sizeof(...)) to make sure the size is correct.

700

*Does

814

Fits in line above.

1060

inverted comparison

I have some AMD: E350, 5350, APU 6800, Phenom II X4 955 - test OK.

sys/dev/amdtemp/amdtemp.c
814

?

rpaulo added a comment.Mar 4 2017, 9:50 PM

I have some AMD: E350, 5350, APU 6800, Phenom II X4 955 - test OK.

Yes, but that's not enough. Please ask on the current mailing list if anyone is willing to help you with testing.

sys/dev/amdtemp/amdtemp.c
814

if (bid != 0x07 && bid !=0x09 ...

Use the whole line up to 78 columns.

904

Better way to write all these bit comparisons is: if (sc->flags & AMDTEMP_XXX) {...}.

No need for comparison with zero and removes the extra parentheses.
Please change all of them in the file.

rpaulo edited edge metadata.Mar 6 2017, 4:35 AM
rpaulo accepted this revision.

We can commit this once we get more testing done.

sys/dev/amdtemp/amdtemp.c
830

Too many parentheses.

This revision is now accepted and ready to land.Mar 6 2017, 4:35 AM
cy added a subscriber: cy.Mar 6 2017, 1:37 PM
cy added a comment.Mar 6 2017, 1:50 PM

Will there be man page updates?

rozhuk.im-gmail.com updated this revision to Diff 26074.

Since revision 310051 sysctl_add_oid have +1 param, add macro to handle it.

This revision now requires review to proceed.Mar 8 2017, 5:34 AM
In D9759#204651, @cy wrote:

Will there be man page updates?

This is best what I can wrote for humans: http://netlab.dhis.org/wiki/index?id=en:software:freebsd:amdtemp

truckman added a subscriber: truckman.EditedApr 4 2017, 6:22 AM

On

CPU: AMD Ryzen 7 1700X Eight-Core Processor          (3393.69-MHz K8-class CPU)

I get the following results:

sysctl dev.amdtemp

dev.amdtemp.0.rtc.sensor_offset: 0
dev.amdtemp.0.rtc.PerStepTimeUp: 0
dev.amdtemp.0.rtc.PerStepTimeDn: 0
dev.amdtemp.0.rtc.TmpMaxDiffUp: 0
dev.amdtemp.0.rtc.TmpSlewDnEn: 0
dev.amdtemp.0.rtc.CurTmpTjSel: -49.0C
dev.amdtemp.0.rtc.CurTmp: 0.1C
dev.amdtemp.0.%parent: hostb10
dev.amdtemp.0.%pnpinfo:
dev.amdtemp.0.%location:
dev.amdtemp.0.%driver: amdtemp
dev.amdtemp.0.%desc: AMD CPU On-Die Thermal Sensors
dev.amdtemp.%parent:

According to the BIOS on this machine, the CPU idles around 56C, which probably includes the +20C that AMD adds to Tctl for the 1700X and 1800X.

There are a lot more bits set in amd_pminfo 0x6599 vs. 0x7d9 on my FX-8320E. Looks like we need some more documentation ...

Update: None of the data reported above varies with CPU load though I can hear the CPU fan speed ramp up and down.

sys/dev/amdtemp/amdtemp.c
693

On my Ryzen 1700X this test fails because CPUID_TO_FAMILY(cpuid) returns zero, causing the driver to fail to attach. If I change this to:

if (CPUID_TO_FAMILY(cpu_id) < 0x0f)

then the driver attaches.

I m waiting for documentation from AMD to Ryzen CPU, looks like 17h have new interfaces.

amd_pminfo (part of CPUID, as I remember) has many flags, only few of them used for temperature report interfaces.
All work based on AMD documentation for BIOS and kernel developers, 09h-16h processors families.

Add support Bristol Ridge, possible Ryzen - need check
Remove TSI reading methods

add Ryzen specific SMU registers.

emaste added a subscriber: emaste.Sep 4 2017, 8:34 PM
cem added a subscriber: cem.Sep 5 2017, 1:07 AM

I have not reviewed the entire driver because I ran into enough issues I want addressed first to warrant a highlevel comment first.

There is no need to change 80+% of driver lines to make these enhancements. There are a few nice changes here it would be good to have, but they are combined with a bunch of gratuitous changes. In fact, the functional enhancements should be split up into a few patches anyway.

I think the idea of scribbling to random PCI dbsf numbers via amdtemp_pci_write / amdtemp_pci_read runs very contra the NewBus style of drivers. But I may be mistaken.

Finally, I am unsure on the value of basically all of the extra register decoding. What use is TTS / HTC?

sys/dev/amdtemp/amdtemp.c
171

Please use C99 initializers here so it is clear what fields the magic numbers are associated with when the structure definition is far away.

179

Not 2731?

239

sysctl naming convention is lowercase

480

This change is a style(9) regression from the previous code.

509

This check seems to defeat the point of trying to read the value from PCI space, doesn't it?

526

This change is a style regression from the existing code.

531

This change is gratuitous?

584

Can pci_io be set to any other value?

639

This change seems gratuitous?

668

What's this number?

767

Gratuitous change?

772–776

Gratuitous change?

783

Gratuitous change?

815

style(9) nit: No space between i and ++

816

style(9) nit: if (sc->sysctl_cpu[i] != NULL) {

822–823

Probably reverse the order of these two?

850

Seems like gratuitous changes? And the extra space contradicts style(9).

rozhuk.im-gmail.com marked 4 inline comments as done.Dec 7 2018, 7:30 PM
rozhuk.im-gmail.com marked 4 inline comments as done.
rozhuk.im-gmail.com added inline comments.
rozhuk.im-gmail.com updated this revision to Diff 51734.
sys/dev/amdtemp/amdtemp.c
179

What is 2731?

526

This is temporary change for ryzen debug.

cem added inline comments.Dec 7 2018, 7:48 PM
sys/dev/amdtemp/amdtemp.c
179

floor(0°C in deci-Kelvins). See r300421 and r285994.

526

That doesn't make sense. It doesn't change functionality; it's purely a style change. How does a style change help debug?

rozhuk.im-gmail.com marked 5 inline comments as done.Dec 7 2018, 9:27 PM
rozhuk.im-gmail.com updated this revision to Diff 51741.

More proper temp calculatuion based on RangeUnajusted, according to "Preliminary BIOS and Kernel Developer’s Guide (BKDG) for AMD Family 16h Models 00h-0Fh (Kabini) Processors".

cem removed a subscriber: cem.Jan 5 2019, 1:25 AM