Hi guys, after some trouble with stuttering caused by DPC latency, my Linux install started flooding me with messages suggesting some hardware may be at fault. These are from my dmesg log:
And it keeps adding more over time; at one point I had more than 1500 errors.
Googling the error message didn't turn up much. Nvidia says Xid 61 is an "Internal micro-controller breakpoint/warning". Source: https://docs.nvidia.com/deploy/pdf/XID_Errors.pdf
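For anyone who wants to keep track of how fast these pile up, here is a minimal sketch that tallies Xid messages per GPU and per code. It assumes the kernel log lines look roughly like `NVRM: Xid (PCI:0000:01:00): 61, ...` (the `PCI:` prefix seems to vary between driver versions); the function name `count_xids` is made up for illustration, so adjust the pattern to whatever your own log actually shows.

```python
import re
from collections import Counter

# Assumed line shape: "NVRM: Xid (PCI:0000:01:00): 61, ..."
# The "PCI:" prefix is treated as optional because it differs between driver versions.
XID_RE = re.compile(r"NVRM: Xid \((?:PCI:)?([0-9a-fA-F:.]+)\): (\d+)")

def count_xids(lines):
    """Return a Counter keyed by (pci_address, xid_code) for every matching log line."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

To use it, save the dmesg output to a file and pass the open file to `count_xids`, then print the resulting counter. Running it again later shows whether a single code (such as 61) keeps growing or whether other codes are showing up too.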
It would help to say what video card you have and what you're using it for. Put it in a Windows environment and run some tests to see if it crashes; if it doesn't, the GPU probably isn't the cause.
It's a GTX 670. I had some freezes, black screens and artifacts on Linux, and black screens on Windows as well, but not all the time. Yesterday I was reading some webcomics and poof, the screen went dark. After a reboot it was still dark, and then it ran normally for the rest of the evening. I just finished building my new PC and cleaned all the older components I wanted to reuse, so dust isn't an issue. Searching for this particular error doesn't bring up any useful information.
I had a GTX 780 fail. EVGA played 20 Questions with me until I said the A-word. As soon as I told them that I had visual artifacts on the monitor, they told me to box up the GPU and ship it to them.
I tried everything you suggested; I even tried a different PCIe slot. At first I thought it could be my motherboard, since my video card never had any trouble before. Not at this frequency, at least.
If it is no longer under warranty, then you have nothing to lose by taking it apart. My GTX 780 failed because EVGA applied thermal compound to only 2/3 of the GPU die. I assume this caused the card to slowly cook itself to death. Months after I purchased it, it began crashing with a black screen under heavy load. Yes, it functioned flawlessly in that condition for months! I eventually found and corrected the thermal paste issue. The card was stable for another 3-4 weeks and then it started black-screening again, but this time with multi-colored squares drawn randomly across the screen. This happened literally (thankfully) two days before the warranty expired. The strange thing is that I never saw any elevated temperatures on that GPU! It never climbed any higher than the mid-70s.
Best of luck, but your situation does not sound encouraging.
You never know. Thanks for the tip; I have an EVGA card as well, so it could be the same problem. Hopefully it will hold on until VEGA arrives. Then I will have a lot of choices, I guess.