Weird thermal issues on my recently-purchased Ryzen 5 1600 AF

Motherboard: MSI B450 Tomahawk Max
Cooler: Scythe Mugen 5 rev. B
CPU: R5 1600 AF
RAM: 3200 MHz Corsair Vengeance
OS: Manjaro KDE

TL;DR @ bottom

I built a new computer in February of this year. Everything went well and I had no discernible issues.

Once the weather started warming up, I realized my chip was running hotter than it should because I was occasionally experiencing frame dips in games. I posted on Reddit asking what the potential causes could be. The consensus was that either the thermal pad I was using (IC Graphite) was bad or I had mounted my cooler improperly.

I re-seated my cooler and used a traditional thermal paste. The temperatures dropped slightly (from a max of ~98° C to a max of ~95° C, still way less than ideal).

My next thought was that this has to be a strange boost issue, perhaps my voltage was boosting beyond the rated limit.

I initially decided to manually set my clock speed to 3.4 GHz with a manual voltage of 1.05 V, but then decided instead to cap clocks at a max of 3.4 GHz and set a voltage offset of -0.1 V.

While my temps are slightly better (I don’t drop frames anymore), the problem persists. I don’t think it’s a hardware issue anymore. Or if it is, it’s not obvious. It could be software, but I have no proof to back it up. I’ve added some pictures with a description of what was running during the graphs.


Program: Blender
Core Performance Boost: Off
PBO: Off
Manual Vcore: 1.05V
Manual Core Clock Multiplier: 34
This screenshot actually tells a story that I don’t quite understand. The core temperature graph indicates that the temperature is climbing rapidly over the course of several seconds. However, from the clock frequency graph you can see that I didn’t even start the Blender render until right before I took the screenshot. The temperature of my CPU was spiking up when the system was basically just sitting on Blender and a web browser.


Game: Mordhau
Core Performance Boost: Off
PBO: Off
In this example the temperature was already held stable at ~47° C when it spiked up into the 90s.

I can’t put more pictures (despite being able to upload them) because my account is new. I can’t post a link to an imgur album because the editor won’t let me, so there is some valuable information that I can’t put here :slight_smile:

TL;DR

  • My CPU seems to want to idle at ~32° C, game load at ~45-50° C, and all-core load at ~55-60° C.
  • Randomly the CPU Tctl will spike in ~7-10° increments all the way up into the mid 90s and will usually stay there for a minute or two before slowly coming back down. This happens even if the system is left idle.
  • The voltage and clock speed don’t spike when the temperatures do. This makes me think this is not a boosting issue.
  • According to htop, there are no abnormal background processes running that would make me think it’s a software issue.
  • My GPU never goes above the high 50s C and I can feel the air my case fans are blowing, so I don’t think it’s choked for air.

I’m at a loss. Everything went so smoothly in the build process and I have to admit I’m getting annoyed at these thermal issues. I bought a cooler way more powerful than needed for my processor so it would be silent. But it appears I have an issue with my CPU.

Does anybody have any ideas?

Thank you all SO MUCH for reading this if you made it this far. Seriously. I know it’s a wall of text, but it’s greatly appreciated!

EDIT: added ‘amd’ flair to help people searching in the future

I’d try these things:

  • Try installing Windows (just run it with the watermark for tests, no need to buy a copy) and see if under the same circumstances the same spikes happen out of nowhere.
  • Reset the BIOS to default and see if something changes.
  • If the two previous tests didn’t lead to a change in behaviour, I’d try updating the BIOS, if possible, to see if that solves the issue (or overwrite it).

While testing on Windows, if things stay the same, I’d suggest you log all the voltages of your motherboard using HWiNFO64, because my spider senses are telling me that your issues might be caused by some “secondary” voltage like PLL or SoC (or whatever it’s called for Ryzen).

I appreciate the feedback, but I’m not going to install Windows. This has been many people’s first suggestion on other platforms, but I don’t use Windows for the same reason you don’t use Linux. I just don’t like it.

Also, my apologies for not mentioning it in the initial post, but I already reset my BIOS and reset my settings. After I did that I updated the BIOS from the board-shipped BIOS (November 2019) to the most recent one (April 2020). Resetting and updating the BIOS did not fix my issues, unfortunately.

I did take your suggestion and also paid attention to readouts of the SoC voltage alongside the Vcore, and unfortunately I don’t believe that is the issue either, as it stays stable, never going above ~1.125 V at the max and usually staying around 1.1 V.

These temps seem perfectly fine, which makes me think that this is either a weird voltage thing or a weird hardware thing.

Are you able to watch temps in BIOS? If yes, does it spike in BIOS as well?

Also, if you can set CPU fan curves, set it flat at something like 80% speed. It could be the fan not adjusting speed correctly.

I didn’t mean to suggest that you switch OS, just that you find a junk disk and spin up an installation for testing. Also, assuming things about others is generally not a nice thing to do.
The intent of the suggestion is to exclude software issues. You can do the same by launching a clean Linux distro and doing your testing there.

The only thing I can think of, besides a hardware issue, is that you have a process running in the background that’s hitting the CPU with an AVX2 load, which is making your CPU temperature spike very aggressively. What’s the CPU load when the temperature spike happens? Does it happen while idling on the desktop after a cold boot?

Yeah, actually that idea came to me just a couple hours ago and I sat in BIOS for about 15 minutes to see if it would happen, but it didn’t. It stayed at an idle temp of 34-36° C in BIOS.

That sounds to me like it may be more indicative of a software issue, then? What do you think?

You’re right. I’m sorry. I’ve never not been able to fix an issue like this before. I’m just frustrated and I shouldn’t have said that.

The CPU load varies when the temperature spikes happen. It can happen after a cold boot when I’ve been sitting idle for a few minutes (usually less than 2% CPU utilization), during a game (~20-50% CPU utilization depending on the game), and during Blender renders to simulate an all-core workload (near 100% CPU utilization).
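To put numbers on "CPU load at the moment of a spike" without relying on a GUI monitor, a rough sketch like the one below could sample overall utilization from `/proc/stat`. The function names are made up for illustration, and it assumes the standard Linux layout of the aggregate `cpu` line (user, nice, system, idle, iowait, ...).

```python
from pathlib import Path
import time


def parse_cpu_line(line: str) -> tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(x) for x in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
    # guest/guest_nice are already counted inside user/nice, so sum only the first 8
    total = sum(ticks[:8])
    return idle, total


def utilization_percent(before: str, after: str) -> float:
    """Busy percentage between two snapshots of the aggregate 'cpu' line."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    d_total = total1 - total0
    if d_total <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle1 - idle0) / d_total)


def sample_utilization(interval_s: float = 1.0) -> float:
    """Take two live snapshots interval_s apart (Linux only)."""
    def read() -> str:
        return Path("/proc/stat").read_text().splitlines()[0]

    before = read()
    time.sleep(interval_s)
    return utilization_percent(before, read())
```

Calling `sample_utilization()` in a loop and printing it next to the temperature reading would show whether a spike lines up with any load at all.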

Yeah, if it stays consistent in BIOS and not after loading the OS, I’d say it’s most likely either some software that’s running or the OS itself.

Booting a live USB of another distro may be a good idea. Let it sit for a while and see what happens.

It could even be the monitoring program not interpreting the thermal sensor data correctly; I’ve had that happen before.
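One way to rule out the monitoring program is to read the raw values the kernel exposes and compare them with what the GUI shows. This is only a sketch: it assumes the `k10temp` driver is loaded and publishes its readings under `/sys/class/hwmon` in millidegrees Celsius, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_hwmon_dir(driver: str = "k10temp",
                   root: str = "/sys/class/hwmon") -> Optional[Path]:
    """Return the hwmon directory whose 'name' file matches the driver, if any."""
    for hwmon in sorted(Path(root).glob("hwmon*")):
        name_file = hwmon / "name"
        if name_file.is_file() and name_file.read_text().strip() == driver:
            return hwmon
    return None


def read_temps_c(hwmon: Path) -> dict[str, float]:
    """Map each sensor label (e.g. 'Tctl') to its temperature in degrees C."""
    temps = {}
    for input_file in sorted(hwmon.glob("temp*_input")):
        label_file = hwmon / input_file.name.replace("_input", "_label")
        # Fall back to the file name when the driver provides no label
        label = (label_file.read_text().strip()
                 if label_file.is_file() else input_file.name)
        # hwmon reports temperatures in millidegrees Celsius
        temps[label] = int(input_file.read_text().strip()) / 1000.0
    return temps
```

If `read_temps_c(find_hwmon_dir())` reports the same spikes as the graphs, the monitoring tool is probably not the one misreading the sensor.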

I’ll boot another distro from a USB and monitor some temps for 15 or 20 minutes some time tonight if I have time, or tomorrow night if I don’t.

That’s what I had initially thought, but I was dropping frames at the same time the temperatures spiked. The MHz and Vcore also tanked at the same time, so it definitely was throttling.

Here’s a picture of the graphs when it was throttling (before I messed around in the BIOS to stop it from boosting so high).

I couldn’t include this in the original post because of upload limitations for new users.

Based off the data I’ve seen so far (low voltage, high temps), I’m starting to think you have too much or too little tension on one corner of your CPU cooler. That particular cooler has a somewhat complex mount as far as even pressure is concerned. The rapid climb in the pictures is what’s driving my conclusion. If the mounting turns out to be perfect, then something we are monitoring is not reporting accurate data, whether that be voltage, temperature, etc.

He already said he’s done that in the OP

I’m going to guess you have a severe lack of airflow or a cooler that can’t manage.

What case? How are your case fans configured?

I’ll reseat my cooler one more time before booting another OS and viewing temps in different software.

I have 2 intake fans (one of them is on a hybrid liquid cooled GPU with a 120 mm rad in the front) and one exhaust fan in the back. I’m using the Phanteks P400 A Digital (the one that Gamers Nexus gave the best all-rounder award). I don’t think my air flow is lacking. If I put my hand on the front of my PC case I can feel the air being sucked in.

Also, if it were an airflow issue my GPU should be getting hot as well, but it isn’t.

Apologies for not being able to update with more data from further attempts to fix the issue. I’ve been busier than I anticipated recently. I should be able to check temperature in another OS some time within the next week, probably this weekend.

Any follow-up? I’m always curious how these things turn out.

My apologies for the delayed response. No follow-up yet. I thought I had some decent free time and then life got super busy. I finally have some free time tomorrow, so I can guarantee that by tomorrow night I will have done two things:

  1. Re-seat the cooler
  2. Monitor temps in a separate OS

I’ll update with results very shortly afterwards!

Very sorry for asking for advice and then just dipping. That wasn’t the intent lol.

Hello all! I’ve got a somewhat surprising update.

I haven’t been monitoring my temps much. But today, before doing any troubleshooting, I monitored temps for a couple of hours as a sanity check.

While the problem still persists occasionally, it is MUCH less frequent than before.

This is what would happen before:

  • Normal temperatures for given load for ~2-5 minutes
  • Massive temperature spike into the 90s for about a minute, maybe two
  • Temperature returns to normal, rinse and repeat

This is what happens now:

  • Normal temperatures for given load for ~5-10 minutes
  • Abnormal but not awful temperature spike into the low-to-high 80s for about a minute, maybe two
  • Temperature returns to normal, rinse and repeat
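A before/after comparison like this is easier to trust with numbers behind it. Below is a minimal sketch that takes logged `(timestamp_s, temp_c)` samples and reports each spike episode plus the average quiet time between episodes. The 80° C threshold, the sample format, and the function names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Spike:
    start_s: float  # timestamp of the first sample at or above the threshold
    end_s: float    # timestamp of the last sample at or above the threshold
    peak_c: float   # hottest sample seen during the episode


def find_spikes(samples: Iterable[tuple[float, float]],
                threshold_c: float = 80.0) -> list[Spike]:
    """Group consecutive samples at or above threshold_c into spike episodes.

    samples must be (timestamp_s, temp_c) pairs in chronological order.
    """
    spikes: list[Spike] = []
    current: Optional[Spike] = None
    for t, temp in samples:
        if temp >= threshold_c:
            if current is None:
                current = Spike(t, t, temp)
            else:
                current.end_s = t
                current.peak_c = max(current.peak_c, temp)
        elif current is not None:
            spikes.append(current)
            current = None
    if current is not None:  # log ended while still hot
        spikes.append(current)
    return spikes


def mean_gap_s(spikes: list[Spike]) -> Optional[float]:
    """Average time between the end of one spike and the start of the next."""
    gaps = [b.start_s - a.end_s for a, b in zip(spikes, spikes[1:])]
    return sum(gaps) / len(gaps) if gaps else None
```

Running this over a log from each period would give a peak temperature and a mean gap for "before" and "now", rather than an impression from watching graphs.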

I booted up the current ISO of Pop_OS on a usb drive and installed lm_sensors for temperature monitoring and it appears to follow this exact same pattern.

I’m curious as to what could cause this sudden change in behavior. It definitely seems to be a hardware issue now… But how? I’ve been racking my brain over this for the past half hour or so.

Is it possible that both times I seated my cooler I tightened it too much, and that over the time I’ve left it to sit it has been trying to normalize? I can’t think of anything else. I’m probably going to re-seat my cooler soon without tightening it as hard as I did, to test it.

When you do get time to reseat the cooler, maybe take a picture of the thermal paste and bottom of the cooler. It may reveal something to smart people here.

While most coolers are pretty good about tolerating some unequal tension, some are not (including when dealing with not-so-perfect CPU surfaces). If one side is tighter than another, it can absolutely cause hot spots on the opposite side of the cooler. This was easy to demonstrate on the stock Intel CPU coolers with the plastic tension push pins: if you released a single pin, you would get exactly the CPU thermal anomalies you describe. With that said, focus on even tension if you decide to reseat the cooler, and follow an X pattern with equal turns to ensure even tension.

Hey guys. An update.

I re-seated my cooler yet again and I took pictures of the distribution of the TIM.


I ran a ~7 minute Blender render to warm up the paste before removal. This is the dirty shot of the cooler.


Dirty shot of the IHS.

I don’t think the issue was uneven tension. It looks like it had an even distribution, right? I know it doesn’t go all the way to the corner on every corner but I think it looks like it was enough, right? I’m not sure.

As I re-seated the cooler this time I decided to re-use my IC Graphite thermal pad to test whatever slight differences there may be between it and the paste I was using. As suspected, it does run a few ° C hotter than the paste, but nothing insane that would really deter me from using it for now. I’ve reused it because finding isopropyl alcohol in my area is somewhat of a crapshoot and I’ve only got half of a small bottle left.

Also, the updated behavior from before is still persistent… So the temperatures still spike less frequently now than they did when I’d initially made this thread. It still doesn’t thermal throttle anymore.

Whatever the problem is, it’s not getting any worse.

I think, unfortunately, that I will have to take everybody’s suggestion of using an old hard drive to test out what Windows 10 says in HWiNFO64 for a potentially closer look into this problem.

@ThatCoolNerd: what application are you using in the first screenshots? I use Linux too but never found a monitoring program that is so detailed.
I do have the same CPU as yours but I don’t think I have your problems with it.
I would love to have a good monitoring program in Linux, just in case I need it. And perhaps this would be useful for others too.
