AMD drivers causing a fresh Win10 x64 install to freeze (Vega 56)

I’m in the process of putting together a new rig, and I’m getting hard lockups/freezes (ctrl+alt+del does nothing, it just locks up, and I have to cut power to reboot). Here is what is currently connected to the system (it’s in the test-for-POST stage, sitting on the mobo box):

CPU: AMD Ryzen 1700X
CPU Cooler: [air cooled] CoolerMaster MasterAir MA610P RGB fancy thing that sucked to install because the wiring for the fans is almost unreachable when you have non-presidential man-sized hands.
Mobo: MSI Gaming M7 ACK (latest BIOS as of time-of-post installed) – the WiFi card that came with the mobo is not installed at this point. For troubleshooting purposes I did a fresh install without the WiFi module so Windows wouldn’t need to install drivers for it, in case that was the root cause.
Memory: 4x 8GB Corsair LPX 3200MHz (these modules are on the QVL for this motherboard)
GPU: Gigabyte RX Vega 56
PSU: Corsair HX750 Gold (the semi-modular model)
Storage: Boot drive is a WD Black 512GB M.2 NVMe, and an ADATA SX7000 512GB M.2 NVMe is in the other M.2 slot

So, this is a fresh barebones install of Windows 10 x64. I’m talking downloaded the ISO from Microsoft, flashed it to a USB thumb drive and installed it (luckily with USB 3.0 and M.2 NVMe this only takes about 5 minutes, because I’ve done it 6 or 7 times at this point). Absolutely no other software is installed other than the latest AMD drivers and whatever drivers Windows installs by default. I’m getting hard lockups/system freezes shortly after logging in to Windows 10. Hard lockups - the whole thing just freezes, no ctrl+alt+del, no nothing. I have to cut power either by shorting the front panel power pins on the mobo or by turning off the PSU.

Event Viewer, to my knowledge (navigating those logs is a goddamn mine field), isn’t reporting anything substantial – unless I’m looking in the wrong category (I’ve checked System Events, and the general Critical one – I forget what they call that). The only thing it reports is the unexpected shutdown, at least that I can find. I may need some tips & tricks for where to look in Event Viewer, as it’s not user friendly in the least bit.

Troubleshooting done so far:

  • I let Memtest86 do an 8-pass RAM torture test last night, it reported 0 errors, so I don’t think it’s bad memory modules.
  • Burned about a half an ounce of very good weed contemplating how I’m going to explain to my old lady that the $850 graphics card I bought doesn’t work, and that I may not be able to use it, and Newegg may fight me about returning it. I might be homeless next week, in which case none of this will matter.
  • I booted to an Ubuntu 17.10 Live image and it functioned fine, idle and under load, for 3 hours without freezing. This leads me to believe it’s a software issue – either Windows 10 shenanigans or the AMD Windows drivers.
  • Currently I’m letting it idle after a fresh install of win10 BUT NOT installing the AMD drivers, to see if I still get a lockup. So far (15 minutes in) it seems to be OK. 15 minutes is much much longer than I got with the drivers installed.
  • Currently the connected display is a big beast of a monitor (HP Omen 32), connected via DisplayPort. During troubleshooting I’ve tried connecting it via HDMI instead, and tested with FreeSync both on and off. I also tried a totally different display that does not have FreeSync. Nothing had any effect; it still locks up with the AMD drivers installed.
  • Via the process of elimination (my favorite troubleshooting technique) this is leading me to the conclusion that the AMD drivers for Vega are… well, shit. Or they’re just buggy. But, they could be shit.

Now look, I’m OK with Linux, as a matter of fact that’s my OS of choice, but this is supposed to be a gaming machine – I paid thru the nose for the damn card. Unfortunately for gaming, if you want to play the newest titles, Windows is realistically your only option. I say this as a somewhat experienced Linux user and fan of that OS, but for gaming it’s just not the best choice (hopefully this changes more quickly with the advent of Vulkan, etc.).

I’m curious if anyone else has had any issues with the x64 Win10 AMD drivers? I can’t for the life of me figure this out, I mean it’s brand new hardware, fresh out of the boxes. The RAM is good insofar as I can tell, I don’t think that’s the issue. BTW, 25 minutes into the fresh Win10 install without the AMD drivers, still not frozen. I contemplated calling AMD tech support, but honestly calling tech support for any company these days is #1 a waste of time and #2 an exercise in frustration and #3 sometimes I can’t even understand their thick Indian accents (no offense intended, it’s just reality). Plus they probably wouldn’t tell me anything useful anyway, I’ve already done most of the stuff their scripts likely say to try.

EDIT: ~ 45mins booted up without AMD drivers installed, no lockups yet.
EDIT 2: Getting close to 1.5 hours of on-time with no freezes or lockups. I’m 85% sure at this point that the issue is with the GPU drivers. I could be wrong however, who knows.

Ok so if the GPU drivers are the problem… what is the end-user recourse here? There isn’t another option (on Windows) for drivers is there?

If anyone has any advice, troubleshooting tips, commands to run, logs to look at (event viewer tips would be appreciated, I am not well versed with that tool), etc. ANYTHING I’m listening, and it would be very much appreciated. Thanks in advance.
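On the Event Viewer question, here is a rough sketch (not a definitive recipe) of pulling the relevant entries from the System log with the built-in `wevtutil` tool instead of clicking through the GUI. The helper names `build_wevtutil_query`, `run_query` and `CRASH_EVENT_IDS` are made up for illustration, and the choice of event IDs is an assumption about what is worth looking at. Keep in mind that a hard freeze often leaves nothing behind except the Kernel-Power 41 entry written on the next boot.

```python
import subprocess

# Event IDs commonly logged around an unclean shutdown or crash (assumed shortlist):
#   41   Kernel-Power: the system rebooted without cleanly shutting down
#   6008 EventLog: the previous system shutdown was unexpected
#   1001 BugCheck: the machine rebooted from a bugcheck (BSOD)
CRASH_EVENT_IDS = (41, 6008, 1001)


def build_wevtutil_query(event_ids=CRASH_EVENT_IDS, count=10):
    """Build a wevtutil command that lists the newest matching System events."""
    id_filter = " or ".join(f"EventID={i}" for i in event_ids)
    return [
        "wevtutil", "qe", "System",       # query-events on the System log
        f"/q:*[System[({id_filter})]]",   # XPath filter on the event IDs
        f"/c:{count}",                    # maximum number of events to return
        "/rd:true",                       # newest events first
        "/f:text",                        # human-readable text output
    ]


def run_query(command):
    """Run the query (Windows only) and return its text output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

On the Windows box, `print(run_query(build_wevtutil_query()))` from an elevated prompt should list the most recent matching events. If the only hits are Kernel-Power 41 entries with no BugCheck record, the machine froze without getting as far as a blue screen.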


Memtest had its time 10+ years ago. Using it now doesn’t prove anything, and it’s basically overdoing the whole troubleshooting, in the wrong area on top of that. It could easily still be memory issues, so try lowering the speed on the RAM. And look up settings for the RAM in general; there are also brand new BIOS updates for Ryzen which seem to run more stable.

Please don’t recommend Memtest for anything remotely modern. Current boards have far better options, like a QR code display or even a buzzer (the mini speaker, which beeps 4 times for memory errors on Ryzen). That takes less than a minute, even seconds, to tell if something is wrong, vs. 8+ hours that STILL don’t prove ANYTHING…

Are you sure? Because I could only find 3200 Corsair validated for two DIMMs.
Use half the memory, and set it to 2933 with a fixed 1.35 V.

The PCIe power cables are connected on both sides? (I did that once)

What PSU are you using?

PSU is a Corsair HX750 (the 80 plus Gold, semi-modular model)

During more troubleshooting (after I typed up my original post) I tried just that with the RAM. I disabled any XMP profiles, and tried all manner of combination of 2 DIMMs, 4 DIMMs, etc. I didn’t re-do the Memtest86+ because that takes forever, but I did run a few of the memory test options on AIDA64, no issues. Even with all 4 installed I didn’t get any errors.

And I’ve done the same thing with power cables before, a dumb mistake in retrospect for sure, but I’d bet it’s a fairly common thing – there’s a lot there that can be overlooked. I did triple-check that all PCIe power cables were properly seated & connected, as well as the mobo 24-pin and the supplemental CPU power (EPS, I think, or whatever that’s called).

Some more testing I’ve done since the original post: removed the Vega card and installed a donor XFX RX 580. Did the same exact process as before (fresh nuke & pave, nothing installed but bone stock Win10 Pro, booted up, started Edge one time, downloaded the GPU drivers from amd.com, installed them, rebooted and hoped for the best). Annnnnnnnd it’s been running the AIDA64 GPU stress test for ~35 minutes… still chugging.

Via the process of elimination here I’m heavily leaning towards the Vega card being the issue and causing the lockups, but only with the 18.3.3 drivers installed – I get no problems with the default Windows graphics driver. This is pushing me towards a driver issue… I am at a loss as to what else it could be.

The only other thing I can think to try is putting the Vega 56 in another machine and seeing if it locks that up too, in which case it could be faulty hardware I guess. At this point I would probably put money on it being a driver issue – a conflict of some sort maybe, I don’t know. I’m fairly confident it’s a software issue, but I could be wrong – maybe the Vega card is just faulty, which isn’t unheard of I don’t think. Unlikely – I assume they factory test these video cards they sell you for the price of a beater automobile – but it’s not impossible.

OK, what would you suggest to torture test the RAM then?

I also let AIDA64 hammer that shit into the nether and didn’t get any errors either.

I encountered a similar problem yesterday with the latest Windows update + 18.3.3 Crimson drivers.

2x RX 580, and I had BSODs on boot twice.
Only while doing a clean driver install did it manage to lock up with a black screen.

There might be something to the drivers angle or there might not.

One would need more data to get an idea.

I’m currently booted up in the original configuration (Vega 56, all RAM, etc.) and running balls to the wall without issue using v18.2.1 – which was the only other download option on AMD’s site other than 18.3.3. It seems stable; I haven’t gotten any freezes on this driver – nor did it ever freeze on Ubuntu 17.10, Fedora 27, or even on Windows 10 with the default graphics driver installed.

I also tested the card in a different machine – but I can’t say that’s an apples-to-apples comparison because I’m unsure if the motherboard in my other rig has PCIe 3.0 or not, and I don’t know if that even matters or not.

I’ve tested it every way I can think of from a hardware standpoint (given the hardware I have access to right now). I’m quite sure at this point that the issue is the 18.3.3 drivers; the older ones seem to be OK under synthetic benchmarks and stress tests. I haven’t tried any games yet (work and kids and all that) but I’m going to try and get some game time in tomorrow evening. I’m quite excited: it’s spring break for the kid, so he’s off school next week and we might get to play some games together! And if I have to play Ark on Xbox One @ 22fps any longer I might lose my mind.
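For anyone following along, the process of elimination in this thread can be written down as a small table and checked mechanically. This is only a sketch. The rows are the test runs reported above, with two assumptions: the RX 580 run is taken to have used 18.3.3 (the post only says the drivers came from amd.com), and the Linux runs are labelled "distro default" because no driver was named. The names `OBSERVATIONS`, `FACTORS` and `suspects` are made up for illustration.

```python
from itertools import combinations

FACTORS = ("gpu", "driver", "os")

# Test runs reported in the thread. The RX 580 driver version and the
# "distro default" Linux driver label are assumptions, not stated facts.
OBSERVATIONS = [
    {"gpu": "Vega 56", "driver": "18.3.3", "os": "Windows 10", "froze": True},
    {"gpu": "Vega 56", "driver": "Windows default", "os": "Windows 10", "froze": False},
    {"gpu": "Vega 56", "driver": "18.2.1", "os": "Windows 10", "froze": False},
    {"gpu": "Vega 56", "driver": "distro default", "os": "Ubuntu 17.10", "froze": False},
    {"gpu": "Vega 56", "driver": "distro default", "os": "Fedora 27", "froze": False},
    {"gpu": "RX 580", "driver": "18.3.3", "os": "Windows 10", "froze": False},
]


def suspects(observations, size):
    """Return factor combinations of the given size that appear in every
    freezing run and in no stable run."""
    failing = [o for o in observations if o["froze"]]
    passing = [o for o in observations if not o["froze"]]
    found = []
    for keys in combinations(FACTORS, size):
        # Candidate combinations are taken from the runs that froze.
        candidates = {tuple((k, o[k]) for k in keys) for o in failing}
        for cand in sorted(candidates):
            in_all_failing = all(all(o[k] == v for k, v in cand) for o in failing)
            in_any_passing = any(all(o[k] == v for k, v in cand) for o in passing)
            if in_all_failing and not in_any_passing:
                found.append(cand)
    return found


print(suspects(OBSERVATIONS, 1))  # single factors that explain the freeze
print(suspects(OBSERVATIONS, 2))  # pairs of factors that explain the freeze
```

With these rows, no single factor separates the freezing run from the stable ones. Only the pairing of the Vega 56 with 18.3.3 does, which matches the conclusion above. It rests on a single freezing configuration, so it is a hint rather than proof.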
