UnFuck this ... whatever ... I don't know [solved]

Yup.. I am one of the resident EEs on the forum.. one of the few lol so if you have a serious electrical question @ me in your post and I'll generally respond if I am around :D

I will warn you sometimes I can be a bit abrasive but given my career that's expected :D don't take it personally

Dude I would see if you could RMA the card. I might be totally off the wall or whatever but the torrent client I use has a bug where it will hard lock waiting for some driver/hardware and is impossible to kill. The only way to end the process is to shut off the computer. It's possible there is something wrong with the card and the driver is locking up because the hardware doesn't respond, or doesn't respond correctly.

"This has nothing to do with W7, this is true for all Windows NT based OS versions. When you kill a process, the user-mode part of the process is thrown away. But the kernel-mode part can't go away until all drivers are finished with the thread(s) that belong to that process.
For example, if a thread is executing an I/O operation, the kernel signals to the driver responsible for the I/O that the operation should be canceled. If the driver is well-behaved, it cleans up the bookkeeping for the incomplete I/O, cleans up the thread state and releases the thread.
If the driver is not well-behaved (or if the hardware that the driver is managing is acting up), it may take forever for it to clean up the incomplete I/O. During that time, the driver holds that thread (and therefore the process that the thread belongs to) hostage.
So you probably have a badly behaving driver on your system that exhibits the problems you describe, this is not surprising giving the beta state of some of the drivers (display, network) running on W7." - willy denoyette: https://social.technet.microsoft.com/Forums/windows/en-US/598fe2b4-844d-412d-b195-5fa53dc62661/end-process-end-task-on-hung-not-responding-programs-does-nothing?forum=w7itproperf

Just finished a short test to look at temperatures. CPU never exceeds 65°C and GPU never exceeds 75°C. So no problem there.

I guess it makes sense to list the specs of my system now. Maybe I'm completely blind for something totally obvious.

Intel Xeon E5 1650 v3
ASRock X99 WS
4 x 16GB Kingston registered ECC DDR4 2133
Sapphire R9 Fury Nitro
Intel X540 T2 dual port 10GBit
Intel series 730 SATA SSD 240GB (main OS)
SanDisk SSD plus 960GB (games)
Hyper X Fury M.2 240GB on the PCIe adapter card (scratch)
and a few small corsair SSDs to play around with stuff
Asus BW-16D1HT blu ray writer
Seasonic Platinum 1050 modded with Noctua NF-S12A PWM running 50% (750rpm)
6 x Noctua NF-A14 ULN on the fan controller of the...
Nanoxia Deep Silence 5, removed all HDD cages for airflow
... and gremlins I guess

Did you run any ridiculous load test like Furmark for a few hours? That tends to find out if a card is knackered or not.

Did you apply those dpm power and performance states?

Here are some commands I used to sort out my problem back when I had it. Not tested with the PRO driver, so I can't confirm they work there, but they should lead you down the right path.

No, I wanted to try a different Windows driver first and so far 15.11 looks stable. I ran Firestrike a few times and right now we are at 45 min. in Furmark. It might be something in all the new drivers that is causing this, at least that would be the cheapest outcome for me....
I'll just let it run for a while longer and after that I'll try to play Firewatch.

I thought this was about getting Ubuntu 16.04 crashes fixed with your 390x... now we are using Windows? ok whatever.

OK, first off I have a Fury, not a 390x. Also I am running different OSes and getting the same freeze. So I have to check my hardware and I also wanted to check older drivers. On Linux that would mean going back to Catalyst. So I am trying Windows first.

So far so good?

Ok fair enough, but for consistent freezes like that all you need to do to test is set the fan to 100% and see if that resolves it or allows you to play longer. If that is the case then you have your answer as to what the problem is.

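PS. echo 255 | sudo tee /sys/class/drm/card0/device/hwmon/hwmon0/pwm1 sets the fan to 100% under Linux (piped through tee because with sudo echo 255 > ... the redirect runs as your normal user and gets denied). Again, not sure if the PRO driver is identical, but it should be.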
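A slightly more defensive take on that fan command, as a rough sketch only. The function names pwm_from_percent and set_gpu_fan are made up for illustration. It assumes the driver exposes the usual hwmon files pwm1 (0-255 scale) and pwm1_enable (1 = manual control), and it globs hwmon* because the hwmon index is not guaranteed to be 0 on every system.

```shell
# Convert a fan percentage (0-100) to the 0-255 PWM scale used by hwmon.
pwm_from_percent() {
    case $1 in
        ''|*[!0-9]*|0?*)
            echo "percent must be a whole number from 0 to 100" >&2
            return 1 ;;
    esac
    if [ "$1" -gt 100 ]; then
        echo "percent must be a whole number from 0 to 100" >&2
        return 1
    fi
    echo $(( $1 * 255 / 100 ))
}

# Set the fan speed for one card (default card0).
# Assumption: pwm1_enable=1 switches the fan to manual control.
set_gpu_fan() {
    pwm=$(pwm_from_percent "$1") || return 1
    card=${2:-card0}
    for dir in /sys/class/drm/"$card"/device/hwmon/hwmon*; do
        [ -e "$dir/pwm1" ] || continue
        # tee runs under sudo, so the write itself has root rights
        echo 1 | sudo tee "$dir/pwm1_enable" > /dev/null
        echo "$pwm" | sudo tee "$dir/pwm1" > /dev/null
    done
}
```

With those defined, set_gpu_fan 100 would pin the fan at full speed for the test run. If the freezes stop or take noticeably longer to show up, heat is the likely culprit.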

OK, Firewatch crashed pretty fast. At least I can now safely say it is NOT a driver issue and it is NOT a Linux issue.

Furmark was running completely fine for more than 90 minutes. But Furmark pushes the card only to 980-1000MHz. In game it goes up to 1050.

The second difference is the usage of VRAM. Games of course use all the VRAM they can get, Furmark uses hardly any. So maybe there are some errors in the HBM or the memory controller or so. And that would fall in line with some freezes on the Linux system while just watching video on YouTube.


Now I'm running Furmark and the benchmark from CPU-Z just to draw a ton of power. If that holds up for some time I don't think the PSU can be blamed.

That went without problems, so for over an hour I drew more power from the wall than in any game. PSU seems unlikely.

Next I clocked down the GPU to around 660MHz and ran some Firewatch. Eight minutes in and boom, freeze. I'm gonna repeat this a few more times with different stuff but it looks like a problem on the VRAM.

Never had a problem like this. So if anyone has suggestions for testing or a different theory...

See this is why you try to be more conservative when naming your threads :P

Hey, it got attention, right? :P

If you're going to use a backport of 4.2, you might as well use 4.5

Uhm, a lot has changed since that post. ;)

You didn't amend the OP, so new visitors still think you're raging at Ubuntu. And yes the title worked.

I already put it in the Hardware/GPU category..... I'm not writing everything again. :P

I just had a freeze on an RX 480.
And after that I got a freeze on my old GTX760.
GPU is out the window.

… I’m getting so sick of this shit.

At this point my best bet is on the board or memory. But memory also seems unlikely because it passed memtest86 multiple times and Linux didn’t say a word either.

Got it to freeze with handbrake....

@Logan, I think you are using the same board. (ASRock X99 WS) Do you have any issues with it?

I think I'm done with this. This is by far not the first problem I had with this system and I already switched out CPU and the ASRock board, I tried different GPUs (green and red), I tried a different PSU, I ran memtest86 for quite some time... I would probably have to look for a different board now anyway. So I take this as a chance to switch the platform. More 2670s for me please.

I got it.

The ASRock board has a UEFI option to enable max turbo clock on all cores.
This option was OFF but it was still doing it. I reflashed the UEFI and now the setting works correctly.
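For anyone chasing the same thing on Linux, one rough way to spot it is to put load on all cores and count how many of them sit above the all-core turbo clock you expect. Sketch only: the function names are made up, the 3600000 kHz in the usage line is a placeholder and not a verified figure for this CPU, and it assumes the kernel exposes cpufreq under /sys/devices/system/cpu.

```shell
# Count how many frequency readings (kHz, one per line on stdin)
# are above the given limit in kHz.
count_above() {
    awk -v limit="$1" '$1 + 0 > limit + 0 { n++ } END { print n + 0 }'
}

# Print the current frequency of every core in kHz, one per line.
# Assumption: the cpufreq sysfs files exist on this system.
read_core_freqs() {
    cat /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq
}
```

While something like Handbrake keeps all cores busy, read_core_freqs | count_above 3600000 prints the number of cores above the placeholder limit. If nearly every core is above the expected all-core clock with the UEFI option OFF, the setting is probably not being honoured.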

What a nightmare... after all the hardware is ok so .. that's good.
Despite this good news I am still switching to dual 2670s in my main rig.
I might be showing that in another thread.

Thanks everyone for suggestions. You helped me a lot to get to the core of all this.

This thread can be closed now
........ finally.
