So, after assembling my latest build (for which I'd been gathering parts for about 6 months), I played a couple of games on it and did some work. All seemed fine until it randomly hung for no apparent reason and I had to hard reset it several times to get the machine to boot up. Before I continue, let me just list some specs so you can have an idea of what I'm working with here:
- AMD FX 8350 @ 4GHz (stock settings, Family 15, Model 02, Stepping 0, Revision OR-C0, voltage 1.26V)
- 16GB Corsair Vengeance Low-Profile DDR3 1866 Quad-Channel Kit (4x4GB modules) running its XMP profile using the D.O.C.P. feature on my motherboard (link: http://www.corsair.com/en/vengeance-low-profile-16gb-dual-channel-ddr3-memory-kit-cml16gx3m4a1866c9)
- Asus M5A99FX Pro R2.0 (Running the latest 2301 BIOS)
- Noctua NH-D14 cooler (the slightly older one that comes with the 120mm and 140mm 3-pin fans; I'm running both with the low-noise adaptors because my motherboard does not want to voltage control the fans and I can't stand the noise)
- Sapphire Radeon HD 7950 Vapor-X with BOOST (the boost button is turned off)
- 256GB OCZ Vertex 4 SSD with the latest 1.5.1 firmware installed
- New 4TB Seagate and one oldish 1TB Seagate SATA II drive
- 1000W Cooler Master V1000 Power Supply (I was planning on doing some overclocking and maybe running a Crossfire/SLI setup in the future, we'll see)
- I'm running Windows 8.1 Update 1 with the latest Windows updates installed (as of today)
- I'm also running the latest chipset and graphics drivers (14.4)
Some probably irrelevant, but perhaps relevant, information:
- I'm using a Corsair M95 mouse
- I'm using a Das Keyboard Professional Silent (with the PS/2 connector for n-key rollover)
- I've got a pair of monitor speakers attached to the onboard soundcard and I mostly listen to music using my cheap Syba USB DAC & AMP. The onboard drivers are up to date and with the DAC I just use Windows' plug-and-play driver.
- I've got my power supply hooked up to a generic-brand 2000VA UPS, since I don't have faith in my house's electrical wiring and we do get a lot of power outages where I live.
- I've got an Antec P280 case with 5 Noctua NF-F12 fans attached to it (2 blowing air out on top, one exhaust fan at the back and two intakes at the front). I've set my cooling profile in such a way that the fans run between 0 RPM and 1100 RPM (since I find this acceptable from an auditory point of view).
- I'm also running a very unconventional dual-monitor setup: my Dell U2713HM is my main display with my old Acer G235H running as a secondary display (this is hooked up to my graphics card using a DVI-HDMI adaptor).
Now I will admit, I never checked to see if the RAM kit I bought last year is compatible with my current board and I could not find it on ASUS' QVL, so it probably isn't approved, but I've never had problems with Corsair's RAM in the past. I've also had the strange problem of the computer not waking up from sleep (or at least the screen not waking up) since I installed my HD 7950 in my old machine (FX8120, Asus M5A97 Evo board).
I've surfed AMD's support forum for hours and this turns out to be a common problem (either caused by a driver issue or the GPU simply not getting enough power to wake up). When this happened in the past, I had to physically restart the computer (occasionally I had to do this repeatedly or switch it off, wait a while and turn it on again) in order to get it to show something on the screen and POST. I've avoided this problem by simply not allowing the screen or the computer to go into sleep mode; just tweaking my power settings this way seems to fix it.
Now, before I digress even more, let me get back to the original topic. My computer froze after I clicked on the start menu. At first I thought I'd wait a bit (since I had Firefox, Excel and SAS open as well), but after about a minute nothing had happened. I moved my mouse and the cursor stayed stationary. I then pressed Caps Lock on my keyboard (it's an old trick I use to see if I have a software or hardware problem) and, sure enough, the Caps Lock light on the keyboard wasn't switching on or off. There was also no BSOD, even though I have Windows set to show the screen and save the result when one happens.
After resetting the computer, I had a similar problem to the one I had in the past, where nothing would show on the screen and I had to switch it off, let it rest and switch it on again before it showed anything. My case doesn't have a PC speaker and the board's graphics card warning LED was lit (which is kind of obvious since both screens were black). Eventually, I got something to show on the screen again and the computer booted into Windows. I haven't had the problem again since (about 2 days and counting).
From my experience with computers (and from some other forums I've visited) this is caused by one of the following:
- the motherboard is shorting;
- I've got a bad processor;
- the graphics card is bad;
- the RAM is bad; or
- the power supply is bad.
So, to diagnose the problem, I thought I'd have a look at each of these components. From what I've seen by looking at the BIOS and HWMonitor's logs, the 12V, 5V and 3.3V rails are within 5% of the ATX specification and they don't seem to fluctuate radically or drop during certain events. In my mind that rules out the power supply. I don't have a voltage meter or anything fancy, so I can't do any more sophisticated testing than that. I also made sure that all the connectors are properly connected to the components. No problems there.
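The rail check described above can be sketched as a small Python script. This is only a rough illustration: the readings in it are hypothetical placeholders (not values from the actual BIOS or HWMonitor logs), and the function names are made up for this example. The only figure taken from the ATX specification is the 5% tolerance on the +12V, +5V and +3.3V rails.

```python
# Nominal voltages for the three main positive ATX rails.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05  # +/-5%, the ATX limit for these three rails


def rail_deviation(nominal, measured):
    """Return the signed deviation from nominal as a fraction (0.02 = +2%)."""
    return (measured - nominal) / nominal


def check_rails(readings, tolerance=TOLERANCE):
    """Map each rail name to a (deviation, within_tolerance) tuple."""
    report = {}
    for rail, nominal in NOMINAL_RAILS.items():
        deviation = rail_deviation(nominal, readings[rail])
        report[rail] = (deviation, abs(deviation) <= tolerance)
    return report


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    sample = {"+12V": 12.096, "+5V": 4.92, "+3.3V": 3.28}
    for rail, (deviation, ok) in check_rails(sample).items():
        status = "OK" if ok else "OUT OF SPEC"
        print(f"{rail}: {deviation:+.2%} {status}")
```

Keep in mind that software sensor readings are approximate, so a pass here is weaker evidence than a measurement taken with a multimeter under load.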
My approach to seeing if the motherboard is shorting wasn't very scientific: as far as I could tell there weren't extra motherboard risers or bare wires crossing. I did accidentally spill a few drops of thermal compound on the board in the past (Cooler Master IE Essential C1, which isn't conductive as far as I know), but I immediately removed it with isopropyl alcohol and it was on one of the heatsinks. So, I wouldn't rule out a short completely, but I'm about 70% sure it's not a short.
The next step is testing the RAM, since that's reasonably easy to replace or fix. I read that you can spot bad RAM 90% of the time using 1-2 passes of Memtest86/Memtest86+ and the Windows Memory Tester. So I ran 2 passes using all the tests in these programs and the results were good. I then thought I'd run Memtest86+ overnight, since most people seem to regard it more highly than Memtest86 and the Windows Memory Tester. It ran for about 10 hours and the next day I found it had logged an error on test #8 during pass number 6. According to the output, the problem lies somewhere in my CPU. So I thought that I would run IntelBurnTest and AMD Overdrive's Stability Test tonight. The CPU passed the standard IntelBurnTest with 10 passes. My temperature was a maximum of 52 degrees Celsius (which isn't bad, since South Africa is a pretty hot place even during "winter").
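One way to see why two clean passes don't prove much for an intermittent fault is a quick back-of-the-envelope calculation. The sketch below assumes every pass has the same, independent chance of tripping the error, which is a simplification (a real fault may depend on temperature or on which test is running). The 1/6 figure is only a crude guess based on a single error in six passes, and `p_detect` is a made-up helper name.

```python
def p_detect(p_per_pass, passes):
    """Chance of seeing at least one error in `passes` passes, assuming each
    pass independently triggers the fault with probability `p_per_pass`."""
    return 1.0 - (1.0 - p_per_pass) ** passes


if __name__ == "__main__":
    # Assumption: one error in 6 passes -> crude estimate of 1/6 per pass.
    p = 1 / 6
    for n in (1, 2, 6, 12):
        print(f"{n:2d} passes: {p_detect(p, n):.0%} chance of catching the fault")
```

Under that assumption, two passes would catch the fault only about 31% of the time, which would explain why the short runs came back clean while the overnight run did not.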
If you made it this far, thank you for bothering to read everything I wrote. What I want to know now is whether or not I've done enough to rule out the CPU/RAM. Should I perform more tests? Is there anything else I can do? I can't tell you specifically what causes the PC to hang or when it will hang, since it happens randomly. Even the wake-up problem is random: the graphics card fails to wake the screen about 4 out of every 9 times on average.