Wondering if anyone can provide assistance on trying to narrow down the root cause of this issue.
Complete PC Specs:
AMD Ryzen 7 7800X3D
Asus ROG Strix B650E-I (BIOS version 2413)
Sapphire RX 7900XT 20GB
32GB (2x16GB) Corsair Vengeance DDR5 PC5-48000 (EXPO set to 6000)
1TB Crucial P3 SSD
750W Corsair PSU
Primary monitor - 3440x1440 Acer Predator x34a
Secondary monitor - 1920x1080 iiyama
Ever since I built this system back in December 2023, I’ve had intermittent graphics errors/crashes while gaming. This shows as both screens completely freezing for a few seconds, the screens going black, then flashing as the graphics crash and recover.
This shows up in the Windows Event Viewer as ‘Display driver amduw23g stopped responding and has successfully recovered.’ or in Linux as ‘[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout’
At first I thought this was a driver issue and would be fixed in the next version, but after 6 months of updates and troubleshooting, I found the main culprit, the Resize Bar setting in the BIOS. Disabling this has had me stable for the first time in months.
Troubleshooting to get to this point:
Multiple installs of Windows 11, Pop_OS! and Nobara with crashes on all three OSes
Replaced SSD with another model
Replaced PSU with old 850W model in case power draw was the issue
Sent GPU back to the retailer for extensive testing but the issue couldn’t be recreated
Now that I’ve finally figured out what appears to be the setting causing the issue, my next step is to figure out why. In theory, if it’s Resize Bar causing the issue, maybe it’s a problem with part of the GPU VRAM?
So I stress tested the GPU VRAM with OCCT and an overnight run of memtest_vulkan, but I couldn’t find any errors at all.
I’ve been banging my head against this issue for quite a while now and I’m kinda stumped as to whether it’s a software/driver issue of some kind, or an underlying hardware issue that I’m struggling to prove.
If anyone familiar with Resize Bar has any tips or insight, it would be much appreciated.
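For the Linux side, one way to track how often this happens is to count the amdgpu timeout lines in the kernel log. This is only a rough sketch: it assumes a systemd distro where `journalctl` is available, and the regex is based purely on the message quoted above, so other kernel versions may word it differently. The function names are made up for illustration.

```python
import re
import subprocess

# Matches the kernel message quoted above, e.g.
# "[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout"
TIMEOUT_RE = re.compile(r"amdgpu_job_timedout.*ring (\S+) timeout")


def find_ring_timeouts(log_text):
    """Return the ring name from each amdgpu job-timeout line in log_text."""
    rings = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            rings.append(match.group(1))
    return rings


def read_kernel_log():
    """Read this boot's kernel messages (assumes systemd's journalctl)."""
    try:
        result = subprocess.run(
            ["journalctl", "-k", "-b", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # journalctl is not installed; nothing to scan.
        return ""
    return result.stdout
```

Calling `find_ring_timeouts(read_kernel_log())` after a crash gives a list of the rings that timed out during the current boot, which makes it easier to compare runs with Resize Bar on and off.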
So you are on a recent BIOS revision for that motherboard. I’d try updating it to the latest BIOS version. I’d make sure everything is properly set within the BIOS for rebar per the ASUS manual. You might need a vBIOS update for the GPU (you shouldn’t, but it’s possible). Not sure how Sapphire handles that anymore and I’d be very careful when doing so. Beyond that I’d reseat the GPU and power connectors on it.
If the above does not work then I’d reach out to Sapphire support. Possible your GPU has a defect. Could also be motherboard or CPU but I’d think those to be a lot more unlikely.
If the GPU was sent in and the problem couldn’t be replicated on a different machine, this leans me toward something with the motherboard’s handling of it. First thing I would do is update the BIOS if not already on the latest. Maybe even contact ASUS support if it is; I’ve had times when the latest BIOS was not put up on the product support site.
Also not to insult your intelligence but you made sure it’s enabled in driver right? Shouldn’t cause crashes even if it’s not, but weirder things have happened. It’s called “smart access memory” in the AMD driver software, IIRC it’s under the settings tab.
Just to clarify about BIOS version, this issue has followed me through the various BIOS versions throughout the year, I think I missed including that in the first post. The latest version just has support for newer CPU models, nothing worth updating for.
I think you’re quoting steps from the article on edgeup Asus. It’s good advice if you’re setting it up for the first time, but not what I’m having issues with.
I’ve reached out to Sapphire before and their support is generally unhelpful, just telling me to speak to the distributor, which I’ve been in talks with for months, but there’s only so much they can do as a third party. So yes, there might be a defect somewhere, but how to prove it is the end goal.
Although I’ve been speaking to Sapphire and the distributor here in the UK regarding the issue, it’s true that maybe a chat with Asus might reveal something.
As for Smart Access Memory, you don’t need to enable it in driver software. I don’t install Adrenalin on Windows and it’s not a thing on Linux.
Just to test, I ran GPU-Z on my Windows 11 install and can see it’s enabled:
While sanity checking settings from both replies I did find the chipset driver was outdated in my Windows install, so we’ll see if that has any bearing on stability, but again doesn’t help my Linux issue.
I’ll see what Asus says and post any developments here regardless.
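There’s no GPU-Z on Linux, but the same check can be approximated by reading the GPU’s `resource` file in sysfs and looking at the size of its memory regions. A minimal sketch follows, with some assumptions: the PCI address in the usage note is hypothetical (find yours with `lspci`), the 256 MiB threshold is the usual non-resized aperture size, and the function names are invented for this example.

```python
def bar_sizes(resource_text):
    """Parse a sysfs PCI `resource` file into a list of region sizes in bytes.

    Each line holds three hex fields: start, end, flags. Unused regions
    are all zeros and are reported as size 0.
    """
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end = int(fields[0], 16), int(fields[1], 16)
        sizes.append(end - start + 1 if end > start else 0)
    return sizes


def looks_resized(size_bytes, legacy_limit=256 * 1024**2):
    """Assumption: an aperture larger than 256 MiB means Resize Bar is active."""
    return size_bytes > legacy_limit
```

Feeding it the contents of something like `/sys/bus/pci/devices/0000:03:00.0/resource` and checking the first entry should show a multi-GiB region when Resize Bar is on. The VRAM aperture is usually the first region on AMD cards, but treat that as an assumption and look at the whole list.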
Nope, just standard troubleshooting/PC janitor stuff.
Hmm… that’s frustrating and surprising that Sapphire won’t provide support. But I’m guessing you’re in a region where this is standard practice or you bought from a vendor with that in the contract.
If you would like to prove it easily then your best bet is to get another GPU that supports rebar and test with that. If the issue occurs with a new GPU then my money is probably the motherboard and maybe the CPU.
I do have an older RX 5600 XT, but I believe it’s only 6000 series and above that support it. Unfortunately I’m not made of money so that’s a no go lol
First, I would enable the Resize Bar. Then, I would check for any related settings in the BIOS that might impact GPU performance or stability. These could include PCIe settings or other advanced configurations.
Other than CSM, there’s nothing else related to Resize Bar that the BIOS cares about
I think you’re referring to ‘dirty power’ right? While I’m sure it exists here in the UK, I don’t think this is an issue. I’ve replicated this when I took my PC into work to troubleshoot on the side.
Fifth, some GPUs require a VBIOS update to fully support the Resize Bar. Check Sapphire’s support or website for available VBIOS updates for your RX 7900XT.
No updates required for this card
The only other advice I could add is if you choose to update your RX 7900XT, I would only get the VBIOS from Sapphire; I also wouldn’t update the RX 7900XT VBIOS unless disabling Resize Bar was troubling me because if you do, you will void your warranty on your RX 7900XT.
Not true in the UK, unless you mean flashing a random VBIOS not from Sapphire.
So update on this issue so far, I found the Chipset drivers in Win11 were quite out of date. For some reason the AMD driver software was providing an older version than what was available through the Asus Armoury Crate software. Strange, but I updated everything, including the BIOS, to the latest version.
A few days later, the crash happened again multiple times. So if it’s driver related, it’s not fixed at least.
Reached out to Asus to see if they think it’s motherboard related in any way and to paraphrase the chain of responses, as long as Resize Bar is enabled, CSM disabled, it should work. If it doesn’t, send your motherboard back to the retailer for testing. They also mentioned that technically the motherboard doesn’t support Linux so for that issue, go pound sand.
I’ll be going back to Sapphire and seeing if they have any better suggestions after the troubleshooting I’ve done since the last time I spoke to them.
Nevermind, I think the Resize Bar troubleshooting is a red herring. After doing some PC maintenance I thought I’d give myself a fresh Windows 11 install and start from scratch.
After getting everything set up, Resize Bar is off, I just want a peaceful Sunday of gaming and I get:
I had HWMonitor open at the time and can see nothing out of the ordinary:
So deep breaths and time to go back to Sapphire and see what they say.
@Lirrachus, I am a network engineer. I know very little about hardware, so Chatgpt generated my last post. I see you tried what Chatgpt suggested. I am all out of ideas.
Come on man, don’t answer forum posts with ChatGPT
World of Warcraft is the main game I play so going off some of the suggestions from @wendell in that thread, I’ll leave HWINFO64 running and see if I can recreate the issue.
Interestingly this time it showed up in Event Viewer as a DWM process crash rather than the driver as before, but the crash was the same experience for me, which is:
Screens Freeze
Fans on graphics card spin up
After 5-10 seconds of freezing, the screens flash and eventually recover
2nd monitor via HDMI isn’t recognised and I need to replug the cable to get it to re-register
I’m going to look into these logs a bit more but if anyone is familiar with reading these let me know.
The only time I get a crash on a card is from a heat issue. Never had a crash unless that was it. You said the Resize Bar was causing issues, and heat can create hidden issues.
insufficient power supply. Replacing PSU with a bigger/better quality unit fixed the issue
I thought the same, as I was running a 750W PSU to fit in my case, but even my old 850W PSU has the same issues. I’ve been monitoring the power draw at the wall and the max wattage has been roughly 430W.
over tweaking my memory configuration (which would include FCLK settings). Dialing back tweaks resolved the issue
No manual tweaks, only EXPO to run at advertised speeds.
In your case … Maybe your motherboard is suspect. Maybe force GEN4 on the PCIe x16 slot and see if that helps …?
Are you using any cable extenders INSIDE your case? … ie … PCIe riser cable …etc.
This is a good shout. I already know it’s running at Gen4 speeds but I’m using a Fractal Terra with a riser cable. The only thing that throws a spanner in that suggestion is that I’ve tested with an older 5600XT in the same slot and had 0 issues. Worth ruling out however.
The Resize Bar was a red herring as I said earlier in the thread. I had some success with it disabled, which led me to believe this was the root cause, but it’s not.
You can also see the temperatures in the screenshots. 70 degrees shouldn’t be cause for concern so I don’t see anything to lead me to a heat issue.
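On the Gen4 and riser cable point, the negotiated link speed can be double-checked on Linux from the `current_link_speed` file in sysfs. The sketch below only parses that string. The mapping table covers the standard PCIe rates, while the function name and the example path in the note are assumptions for illustration.

```python
# Standard PCIe per-lane transfer rates (GT/s) mapped to generation number.
GT_PER_S_TO_GEN = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4, 32.0: 5}


def pcie_generation(link_speed_text):
    """Map a sysfs link speed string such as '16.0 GT/s PCIe' to a PCIe generation.

    Returns None when the string cannot be parsed or the rate is not in the table.
    """
    try:
        rate = float(link_speed_text.split()[0])
    except (IndexError, ValueError):
        return None
    return GT_PER_S_TO_GEN.get(rate)
```

Reading something like `/sys/bus/pci/devices/0000:03:00.0/current_link_speed` (address hypothetical) while a game is running, and comparing it against `max_link_speed` in the same directory, would show whether the link is dropping below Gen4 through the riser. It’s worth reading it under load, since the link can downshift at idle to save power.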