Tdie/Tccd with K10temp on 6.11 for 9950X

Hi everyone,

Has anyone been able to get a Tccd reading from lm-sensors on Ryzen 9000 and kernel 6.11 (and later)?

Thanks.

Can’t get that here either. 9950X on 6.11 kernel. Just Tctl.

The other option is to get the cpu temp from the motherboard SIO chip, either through a motherboard-specific sensor driver like asus-ec-sensors or through the nct6775 driver reading the nct6799 SIO chip and its TSIO_TEMP sensor.

Tctl from the k10temp driver reads the package temp, not any of the individual dies.
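
For anyone who wants to check what each driver actually exposes without going through lm-sensors, here is a minimal sketch that walks the hwmon sysfs tree and lists every labelled temperature. It assumes the standard hwmon layout (a `name` file plus `tempN_input` in millidegrees and an optional `tempN_label`); the function name `read_hwmon_temps` is made up for this post.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {chip_name: {label: degrees_C}} for every hwmon temp input found."""
    readings = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        if not name_file.is_file():
            continue
        chip = name_file.read_text().strip()
        for input_file in sorted(chip_dir.glob("temp*_input")):
            prefix = input_file.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip_dir / f"{prefix}_label"
            # Fall back to the bare channel name when the driver sets no label.
            label = label_file.read_text().strip() if label_file.is_file() else prefix
            try:
                millideg = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # skip channels that cannot be read
            readings.setdefault(chip, {})[label] = millideg / 1000.0
    return readings
```

Calling `read_hwmon_temps()` and printing the result should show a `k10temp` entry with only `Tctl` on the setups described above, plus whatever the SIO driver reports if it is loaded.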

Hi Keith,
FYI: I also tried Zenpower and confirmed my suspicion that it doesn’t work with Zen 5 (the GitHub repo hasn’t been updated in a while). I believe (someone correct me if I’m wrong) that we’ll be getting full support for Zen 5 in 6.12.

Well I’ve been waiting on the 6.12 kernels with bated breath for several weeks now and none have been available so far without building them yourself. Still running the 6.11.0-061100-generic kernel as the last one that has been available from the Mainline PPA.

Then I was toodling around in the zenmonitor3 directory and looking at the source files and found the file that contains the definitions for Zen 2 and Zen 3 families. So I wondered what would happen if I just inserted the definition for Zen 5 family in the file and recompiled the zenmonitor3 app.

Guess what, it works. Now instead of just posting a “No Zen cpu found” message, the application opens and shows the Zen 5 clocks and powers on my 9950X cpus.

The root version of the app still doesn’t show correct ‘effective frequencies’; they are way off. But it does show the powers of the cores and the package power for the cpu, just like it used to do on my 7950X cpus. The registers that show ‘effective frequency’ must be slightly different from Zen 3 and Zen 4, which worked with my 7950X cpus.

And that was the prime reason I wanted to get zenmonitor working on my 9950X cpus: to see how much power it pulls under my normal Boinc loads and under y-cruncher stress tests.

All I did was insert this line into the zenmonitor-lib.c file
#define ZEN3_FAMILY 0x1A
at line 15, right after the existing #define ZEN3_FAMILY 0x19

When it compiled, the compiler simply spat out a message that ZEN3_FAMILY was being redefined from 0x19 to 0x1A.

I don’t know enough about programming to do this properly with a separate #define ZEN5_FAMILY 0x1A, as I would need to fix the rest of the program to use the new ZEN5_FAMILY references.

The program properly recognizes the cpu type, as that is pulled from the OS for cpu identification.
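
To illustrate what the cleaner fix would look like, here is a rough sketch of a family check that adds 0x1A alongside 0x19 instead of redefining it (redefining presumably means a build with the one-line hack no longer matches family 0x19 parts). This is Python for illustration only; zenmonitor3 itself is C and I have not shown how it really does its check. The 0x19 and 0x1A values are from the post above, the 0x17 entry is my addition, and `ZEN_FAMILIES` and `zen_generation` are made-up names.

```python
# Family numbers as reported by CPUID; /proc/cpuinfo prints them in decimal.
ZEN_FAMILIES = {
    0x17: "Zen / Zen+ / Zen 2",  # added for completeness, not from the post
    0x19: "Zen 3 / Zen 4",
    0x1A: "Zen 5",
}


def zen_generation(cpuinfo_text):
    """Return a Zen generation label for the first CPU listed, or None."""
    vendor = None
    family = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value
        elif key == "cpu family" and family is None:
            family = int(value)
    if vendor != "AuthenticAMD" or family is None:
        return None
    # Unknown AMD families fall through to None instead of being mislabelled.
    return ZEN_FAMILIES.get(family)
```

Feeding it the contents of /proc/cpuinfo from a 9950X (cpu family 26) should return "Zen 5", while a 7950X (cpu family 25) still returns "Zen 3 / Zen 4".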

FYI, zenmonitor3 doesn’t need zenpower3 to access the cpu info for clocks and powers as that comes from the powercap, rapl and msr modules. What you are still missing is the SVI2 telemetry for voltages that zenpower gives you.

So even with kernel 6.12 we still won’t get those values as zenpower still hasn’t been updated past Zen 3.

You also need to load the intel_rapl_common and intel_rapl_msr modules, since it’s the rapl stuff that probes the correct registers. Those modules are what was modified for kernel 6.12 to recognize Zen 5.
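
A quick way to see whether those modules are loaded is to look at /proc/modules. A minimal sketch follows; the helper name `missing_modules` is made up, and the default list just mirrors the modules named in this thread. Note that anything compiled directly into the kernel will not appear in /proc/modules, so a "missing" result is only a hint.

```python
def missing_modules(proc_modules_text,
                    wanted=("intel_rapl_common", "intel_rapl_msr", "msr")):
    """Return the wanted module names that do not appear in /proc/modules text."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in wanted if name not in loaded]
```

Typical use would be `missing_modules(open("/proc/modules").read())`, then loading whatever it returns with modprobe.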
[Screenshot from 2024-11-28 14-34-23]

Very cool. But we still don’t get temps for the ccds :anguished:

There haven’t been separate CCD temps since the Zen 5 redesign, which incorporated hundreds of individual sensors throughout the die; AMD just decided to expose a single temp for the package.

At least that’s the case for Linux. On Windows, no issue.

Try inputting this into Google and read its AI response.
“why does k10temp only show Tctl on Zen 5 cpus now?”

The main AMD cpu developer [Mario_Limonciello] for the kernel posted this:
“CCD temperature readings are only valid when ZEN_CCD_TEMP_VALID is set.”

Otherwise values are considered garbage.

See linux/drivers/hwmon/k10temp.c at master · torvalds/linux · GitHub for more details.

When k10temp can’t find the correct CCD temp registers, it just falls back to only exposing the Tctl temp.
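
The valid-bit check and the fallback can be sketched like this. The bit positions and the 125 / 49000 scaling are how I recall them from k10temp.c (valid flag in bit 11, raw value in bits 10:0), so treat them as assumptions and check them against the linked source; the function names are made up.

```python
ZEN_CCD_TEMP_VALID = 1 << 11        # assumed: BIT(11) in k10temp.c
ZEN_CCD_TEMP_MASK = (1 << 11) - 1   # assumed: GENMASK(10, 0) in k10temp.c


def decode_ccd_temp(regval):
    """Return the CCD temp in millidegrees C, or None if the valid bit is clear."""
    if not regval & ZEN_CCD_TEMP_VALID:
        return None  # reading is considered garbage
    return (regval & ZEN_CCD_TEMP_MASK) * 125 - 49000


def visible_channels(ccd_regs):
    """Tctl is always listed; a TccdN entry is added only for valid registers."""
    channels = ["Tctl"]
    for index, regval in enumerate(ccd_regs, start=1):
        if decode_ccd_temp(regval) is not None:
            channels.append(f"Tccd{index}")
    return channels
```

With no valid CCD registers found, `visible_channels` returns just `["Tctl"]`, which matches what sensors shows on the 9950X here.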

I just compiled the latest development corefreq-cli app which is now getting a handle on Zen 5. Working 9950X reporting and trying to sort out 9800X3D reporting.

Still, its sensors panel is only showing the Tctl temps for the cores. No individual CCD1 or CCD2 temps.

CPU Freq(MHz) VID Vcore TMP(C) Accumulator Energy(J) Power(W)
000 5020.01 177 1.1062 84 000000000000855818 13.058746338 13.058746338
001 4745.23 177 1.1062 84 000000000000801877 12.235671997 12.235671997
002 5199.62 177 1.1062 84 000000000000740284 11.295837402 11.295837402
003 5193.27 177 1.1062 84 000000000000858320 13.096923828 13.096923828
004 5199.71 177 1.1062 84 000000000000831457 12.687026978 12.687026978
005 5199.76 177 1.1062 84 000000000000868686 13.255096436 13.255096436
006 5199.76 177 1.1062 84 000000000000809013 12.344558716 12.344558716
007 5199.76 177 1.1062 84 000000000000883239 13.477157593 13.477157593
008 3840.71 177 1.1062 82 000000000000740306 11.296173096 11.296173096
009 2209.56 177 1.1062 82 000000000000808586 12.338043213 12.338043213
010 4949.31 177 1.1062 82 000000000000826735 12.614974976 12.614974976
011 4453.01 177 1.1062 82 000000000000666366 10.167938232 10.167938232
012 4577.24 177 1.1062 82 000000000000849583 12.963607788 12.963607788
013 4293.80 177 1.1062 82 000000000000763995 11.657638550 11.657638550
014 4371.94 178 1.1125 82 000000000000728267 11.112472534 11.112472534
015 4220.92 177 1.1062 82 000000000000793677 12.110549927 12.110549927
016 5199.76 177 1.1062 84 000000000000000000 0.000000000 0.000000000
017 4610.04 177 1.1062 84 000000000000000000 0.000000000 0.000000000
018 3415.35 177 1.1062 84 000000000000000000 0.000000000 0.000000000
019 4861.80 178 1.1125 84 000000000000000000 0.000000000 0.000000000
020 1771.67 177 1.1062 84 000000000000000000 0.000000000 0.000000000
021 4953.98 177 1.1062 84 000000000000000000 0.000000000 0.000000000
022 4749.24 177 1.1062 84 000000000000000000 0.000000000 0.000000000
023 5199.71 177 1.1062 84 000000000000000000 0.000000000 0.000000000
024 4600.62 177 1.1062 82 000000000000000000 0.000000000 0.000000000
025 5049.72 177 1.1062 82 000000000000000000 0.000000000 0.000000000
026 4767.98 177 1.1062 82 000000000000000000 0.000000000 0.000000000
027 5049.72 177 1.1062 82 000000000000000000 0.000000000 0.000000000
028 5002.36 177 1.1062 82 000000000000000000 0.000000000 0.000000000
029 4359.59 177 1.1062 82 000000000000000000 0.000000000 0.000000000
030 3950.07 177 1.1062 82 000000000000000000 0.000000000 0.000000000
031 4709.49 178 1.1125 82 000000000000000000 0.000000000 0.000000000

             Package[0]     Cores          Uncore       Memory       Platform
Energy(J):   282.339202881  195.712417603  0.000000000  0.000000000  0.000000000
Power(W) :   282.339202881  195.712417603  0.000000000  0.000000000  0.000000000
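
For reference, the Accumulator and Energy(J) columns in the per-core listing are related by a fixed energy unit. Working backwards from the first row (855818 counts shown as 13.058746338 J), the unit comes out at 2^-16 J per count. That value is inferred from this listing only, and the Energy and Power columns being identical suggests a one-second sampling interval, which is also an inference. A minimal sketch with made-up names:

```python
ENERGY_UNIT_J = 2.0 ** -16  # inferred from the listing, not read from the cpu


def accumulator_to_joules(count):
    """Convert a raw energy accumulator delta to joules."""
    return count * ENERGY_UNIT_J


def average_power_watts(count, interval_s):
    """Average power over the sampling interval that produced the delta."""
    return accumulator_to_joules(count) / interval_s
```

For example, `average_power_watts(801877, 1.0)` reproduces the 12.2357 W shown for core 001.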

                          Zen UMC  [14E0]                              

Controller #0 Dual Channel
Bus Rate 3000 MHz Bus Speed 2999 MHz DDR5 Speed 5998 MT/s

Cha     CL  RCDr  RCDw    RP   RAS    RC  RRDs  RRDl   FAW  WTRs  WTRl    WR  clRR  clWW
#0      32    38    38    38    78   146     8    15    32     8    30    90     8    23
#1      32    38    38    38    78   146     8    15    32     8    30    90     8    23
       CWL   RTP  RdWr  WrRd  scWW  sdWW  ddWW  scRR  sdRR  ddRR  drRR  drWW  drWR drRRD
#0      30    23    21     8     1    15    15     1     8     8     0     0     0     0
#1      30    23    22     8     1    15    15     1     8     8     0     0     0     0
      REFI  RFC1  RFC2 RFCsb  RCPB  RPPB  BGS:Alt   Ban  Page   CKE   CMD   GDM   ECC
#0   65535   312   192   390     0     0   ON OFF  R0W0     0     0    1T    ON     0
#1   65535   312   192   390     0     0   ON OFF  R0W0     0     0    1T    ON     0
      MRD:PDA  MOD:PDA WRMPR  STAG   PDM RDDATA   WRD   WRL   RDL    XS    XP CPDED
#0      42 32    42 32    24     7 0:F:1     20     6    18    36   914    23    15
#1      42 32    42 32    24     7 0:F:1     20     6    18    36   914    23    15

DIMM Geometry for channel #0
Slot Bank Rank Rows Columns Memory Size (MB)
#0
#1 32 1 65536 1024 16384 KD5AGUA80-64A320H
DIMM Geometry for channel #1
Slot Bank Rank Rows Columns Memory Size (MB)
#0

Nice to confirm I got the memory configured 1:1 for the FCLK and UCLK.

Very nice work and research @KeithMyers. Thank you.
It’s a bummer about Tccd.

Really, the main bit of info from this investigation is that what the Windows monitoring utilities show for core temps is all just ‘garbage’ values filled in for the display.

What is actually shown is just the Tctl value across all cores, and any variation is just readout noise. Same for their supposed Tccd0 and Tccd1 values for the individual dies.

Linux shows only what the cpu really exposes, unlike the Windows utilities’ fabrications.

And FYI, the developer I linked, Mario Limonciello, is employed by AMD as one of their main architecture gurus. I would believe what he posts, as he is someone intimately familiar with the cpu design.

Beware any of the Mainline PPA 6.12 kernel releases. They break the Nvidia drivers in a big way and prevent them from loading in the kernel. The system will only boot with the nouveau drivers after building the 6.12 kernels.

Even though the kernel installs with no errors when incorporating the Nvidia dkms modules, they are not active. I tried my existing 550 drivers, then purged everything and tried the 560 drivers too. None are loaded. So I reverted back to my 6.11.0 kernel, which still runs.

Also the Mainline PPA is back building kernels correctly and I found the 6.11.10 kernel to still incorporate the Nvidia 550 drivers correctly with no issues. So running that now to get almost up to date with what preceded all the 6.12 fixes.
