Dying videocard?

Hi guys, after some trouble with stuttering caused by DPC latency, my Linux install started flooding me with messages suggesting some hardware may be at fault. These are from my dmesg log:

[ 533.594671] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000
[10557.043923] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000
[11008.026024] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000
[11010.950218] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000
[11920.797068] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000

And it keeps adding more over time; at one point I had more than 1500 errors.
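For anyone wanting to tally these instead of scrolling through dmesg, here is a minimal sketch in Python. It assumes the lines look exactly like the ones quoted above (timestamp in brackets, then `NVRM: Xid (PCI:...): <number>,`); the function name `count_xids` and the regex are made up for illustration and are not part of any Nvidia tool.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this thread:
# [ 533.594671] NVRM: Xid (PCI:0000:24:00): 61, 1927(15d4) 00000000 00000000
XID_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: Xid "
    r"\((?P<dev>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),"
)

def count_xids(log_text):
    """Count Xid messages per (PCI device, Xid number) in dmesg-style text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = XID_LINE.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("xid")))] += 1
    return counts
```

To use it, save the kernel log to a file (for example `dmesg > dmesg.txt`, where the file name is just a placeholder) and call `count_xids(open("dmesg.txt").read())`. Lines that don't match the assumed format are silently skipped, so a different driver version with a different message layout would need an adjusted pattern.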

Googling the error message didn't turn up much. Nvidia says Xid 61 is an "Internal micro-controller breakpoint/warning". Source: https://docs.nvidia.com/deploy/pdf/XID_Errors.pdf

Have any of you had these before? Is my GPU dying?

Stating what video card you have and what you're using it for would help.
Put it in a Windows environment and run some tests to see if it crashes; if it doesn't, the GPU probably isn't the cause.

It's a GTX 670. I had some freezes, black screens and artifacts on Linux, and black screens on Windows as well, but not all the time. Yesterday I was reading some webcomics and poof, the screen went dark. After a reboot it was still dark, and then it ran normally for the rest of the evening. I just finished building my new PC, and I cleaned all the older components I wanted to reuse, so dust isn't an issue. Searching for this particular error doesn't bring up any useful information.

Make sure your card is seated properly, check all the cables for connectivity and damage, then make sure you have the right drivers installed.

Also...make sure the power cable is connected properly in your monitor(s).

Did you ever screw around with the thermal paste or pads? Seems like your card is overheating and turning itself off. What are your temps?

I had a GTX780 fail. EVGA played 20 Questions with me until I said the A-word. Just as soon as I told them that I had visual artifacts on the monitor, they told me to box up the GPU and ship it to them.

The card is stock; I never overclocked it and never touched the cooling. The temps go up to around 78 degrees Celsius, but never higher.

I don't think my card is under warranty anymore :stuck_out_tongue:

I tried everything you suggested, and I even tried a different PCIe slot. At first I thought it could be my motherboard, since my video card never had any trouble before. Not this frequently, at least.

If it is no longer under warranty, then you have nothing to lose by taking it apart. The reason my GTX780 failed is that EVGA applied thermal compound to only 2/3 of the GPU die. I assume this caused the card to slowly cook itself to death. Months after I purchased it, it began crashing with a black screen when under heavy load. Yes, it functioned flawlessly in that condition for months! I eventually found and corrected the thermal paste issue. The card was stable for another 3-4 weeks and then it started black-screening again, but this time with multi-colored squares drawn randomly across the screen. This happened literally (thankfully) two days before the warranty expired. The strange thing is that I never saw any elevated temperatures on that GPU! It never climbed any higher than the mid-70s.

Best of luck, but your situation does not sound encouraging.

You never know. Thanks for the tip; I have an EVGA card as well, so it could be the same problem. Hopefully it will hold on until Vega arrives. Then I will have a lot of choices, I guess.