I have a friend with a bit of an overheating issue.
He has two 34" 4K 120Hz monitors for his PC. He is running a Ryzen 5950X and an Nvidia 3090 with a custom loop (2x 360 rads) in an InWin 925. After about 45-60 minutes of gameplay, one of the monitors will shut off. The PC will keep running like this for another hour or two, or it will shut down completely shortly thereafter. The GPU is running at about 80-90C with the liquid at about 45C. The CPU will also be quite toasty at around 90C. He is running Windows 10.
He does not use it for particularly CPU-intensive things during the normal course of his day (occasionally he will have to render something in Blender, or pull up Vectorworks), so I think he should sell the 5950X and get himself a 5700X3D, or a 5800X3D if he can find one. It is a drop-in replacement, and he won't have to ENTIRELY redo his custom loop. I don't know how much this will actually help, or if this is a sign of some other known issue with Nvidia 3090s (though I couldn't find anything relevant and recent online).
I also find it curious that the GPU seems to just shut down one of the displays, then fully crash about an hour or two later. Is this known behavior for a 3090?
I am rocking air-cooled and way less high-end hardware, so I have no idea.
To me it looks like the loop is clogged or airflow to the radiators is very limited. I ran a full custom loop for 6 years, but I did full maintenance teardowns every 12-18 months. I also used premixed fluid, which contained all the bio inhibitors and anti-corrosion additives.
2x 360mm radiators should be able to keep up with a 3090 and a 5950X - at full tilt they generate around 500-550W?
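For a rough sanity check on that number, here is the arithmetic as a quick sketch. The wattages are my assumptions, not measurements from his system: 350 W is the reference board power of a 3090 and 142 W is the stock socket power limit (PPT) of a 5950X. Partner cards and overclocks can pull more, which is how you get into the 500-550W range.

```python
# Rough heat budget for the loop. Both wattages are assumed stock limits,
# not values measured on this particular PC.
GPU_WATTS = 350   # reference RTX 3090 board power (assumption)
CPU_WATTS = 142   # stock Ryzen 5950X socket power limit, PPT (assumption)
RADIATOR_COUNT = 2


def heat_per_radiator(total_watts, radiators):
    """Split the total heat load evenly across the radiators."""
    return total_watts / radiators


total = GPU_WATTS + CPU_WATTS
print(f"Total heat load: {total} W")
print(f"Per 360mm radiator: {heat_per_radiator(total, RADIATOR_COUNT)} W")
```

That works out to roughly 250 W per radiator at stock limits.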
Since the rise in temp is slow but steady, I think block contact is not an issue here, as the temp would skyrocket in an instant with poor block contact. Still, it should be checked during maintenance.
Swapping the 5950X for a 5800X3D is a band-aid at most, since you will get at most a 100W reduction overall (less in gaming), and the 3090 will still be the furnace that it is.
I really recommend doing a full loop teardown and maintenance. This means opening the blocks and clearing their channels, opening the pump, cleaning or replacing the tubes (depending on their state), cleaning the radiators (very important, since they can hide a lot of gunk), and refilling with a premix or distilled water with additives.
If any gunk is found in the loop, then cleaning every part of it is very important, as it will grow back otherwise.
He cleaned the loop and checked the blocks. This is the only Event Viewer log from when it crashed again on retesting after 90 minutes.
Warning 10/3/2024 9:49:40 AM DistributedCOM 10016 None
Log Name: System
Source: Microsoft-Windows-DistributedCOM
Date: 10/3/2024 9:49:40 AM
Event ID: 10016
Task Category: None
Level: Warning
Keywords: Classic
User: ZR-01[Redacted]
Computer: ZR-01
Description:
The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID
{2593F8B9-4EAF-457C-B68A-50F6B8EA6B54}
and APPID
{15C20B67-12E7-4BB6-92BB-7AFF07997402}
to the user ZR-01[Redacted] SID (S-1-5-21-3644006685-2871440048-3140683947-1001) from address LocalHost (Using LRPC) running in the application container Unavailable SID (Unavailable). This security permission can be modified using the Component Services administrative tool.
There should be loads of entries in Event Viewer. Clicking on Event Viewer (Local) to bring up the summary helps me sift through the majority, along with a lot of googling on my part.
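If the list is overwhelming, it can help to keep only the Critical and Error entries from around the crash time. A rough sketch, assuming the events have already been exported and loaded as dicts with "Level", "Source" and "EventID" keys (those key names are my own choice, not a fixed export format). The second sample entry is made up purely for illustration; it is not from his log.

```python
# Keep only the most severe entries from a list of exported events.
# The dict keys and the sample data below are hypothetical.
SEVERE_LEVELS = {"Critical", "Error"}


def severe_events(events):
    """Return the events whose level is Critical or Error."""
    return [e for e in events if e.get("Level") in SEVERE_LEVELS]


sample = [
    {"Level": "Warning", "Source": "DistributedCOM", "EventID": 10016},
    # Made-up entry showing what an unexpected power loss usually looks like.
    {"Level": "Critical", "Source": "Kernel-Power", "EventID": 41},
]

for event in severe_events(sample):
    print(event["Level"], event["Source"], event["EventID"])
```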
To help with analyzing minidumps/crash dumps you can use the “WhoCrashed” software, which is free for home use. It will find dumps (if any were created), try to analyze them, and provide as much info as possible in a human-readable form.
After servicing the loop, did temperatures improve? Because if the loop is saturating with heat that's not being removed, and both the CPU and GPU go to 90C+, then they will eventually shut down to protect themselves.
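One way to tell whether the loop is saturating is to log the liquid (or GPU) temperature at a fixed interval during a session and see whether it levels off or keeps climbing. A minimal sketch, assuming you already have the readings in °C as a list from whatever logging tool you use; the function name, the window size and the 1 °C threshold are all made up for illustration.

```python
def still_climbing(temps_c, window=5, threshold_c=1.0):
    """Return True if the average of the last `window` readings is more than
    `threshold_c` above the average of the `window` readings before them."""
    if len(temps_c) < 2 * window:
        raise ValueError("need at least 2 * window readings")
    recent = temps_c[-window:]
    previous = temps_c[-2 * window:-window]
    return sum(recent) / window - sum(previous) / window > threshold_c


# Hypothetical readings: one run that levels off, one that keeps rising.
plateau = [40, 42, 44, 45, 45, 45, 45, 45, 45, 45, 45, 45]
rising = list(range(30, 42))

print(still_climbing(plateau))
print(still_climbing(rising))
```

If the liquid temperature is still climbing an hour in, the radiators are not getting rid of the heat, which points back at airflow or flow rate rather than the blocks.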