Server becomes unresponsive, physical reboot required

Hi.

I’m not entirely sure where to start the troubleshooting.

My Windows Server 2012 machine hangs at random. (I assume it hangs; at the very least I can’t reach it over the network.)
I only access this PC remotely and can’t troubleshoot it physically, beyond telling people to pull the power cable and plug it back in.

I can’t see anything in the logs hinting at bluescreens, or other errors.

I’m just not entirely sure where to begin troubleshooting.

%SystemRoot%\Minidump\ does not contain anything, from what I’ve seen.
Event Viewer only shows Information entries and the occasional Warning (about the DNS service the server hosts, so they shouldn’t be relevant to any crashes).

Specs:
Dell PowerEdge T20 tower server
Intel Core i5-4570
12 GB DDR3
Radeon 5450
4 TB HDD, 500 GB boot SSD
Default PSU

The case is physically locked, so I can’t access the internals.
I am also not in the country where the PC is located, so hands-on work is a no-go either way.

How would you people go about troubleshooting things?

I’ve tried uninstalling the most recently deployed software on the server, including the AMD drivers for the graphics card.

I connect to the PC regularly; it is used primarily for software-based rendering and for hosting VMs in VMware Workstation 12.

When the PC hangs/disconnects, it still seems to be powered on (the fans spin).
But the remote connection times out after 20 connection attempts.
The virtual machines also become unresponsive and refuse SSH connections (Debian VM).

Is there any software that can track more specifically what’s going on and tell me more exactly when the computer freezes? (Something like writing a row to an Excel file every 30 seconds, just to verify that it’s alive. The file could be extracted after the reboot.)
Preferably software that also reports the status of running programs, to show, let’s say, whether a program is currently hogging the CPU or RAM at 100%. Or just something that can hint at what’s going on.
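
For illustration, the "one row every 30 seconds" idea could be sketched roughly like this in Python. This is only a minimal sketch, assuming Python 3 is installed on the server; the file name `heartbeat.csv` and all function names are made up for the example. CPU and RAM percentages are only recorded if the third-party `psutil` package happens to be installed, otherwise just the timestamp is written.

```python
import csv
import os
import time
from datetime import datetime

try:
    import psutil  # optional third-party package (assumption: may not be installed)
except ImportError:
    psutil = None

LOG_PATH = "heartbeat.csv"   # hypothetical path; use a disk that survives the reboot
INTERVAL_SECONDS = 30


def append_heartbeat(path, now=None):
    """Append one row (timestamp, CPU %, RAM %) and force it onto the disk."""
    now = now or datetime.now()
    # cpu_percent(interval=None) reports usage since the previous call,
    # so the very first value after start-up is not meaningful.
    cpu = psutil.cpu_percent(interval=None) if psutil else ""
    ram = psutil.virtual_memory().percent if psutil else ""
    row = [now.isoformat(timespec="seconds"), cpu, ram]
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to write the row out before a hard power-off
    return row


def last_heartbeat(path):
    """Return the timestamp of the last row, or None if the file has no rows."""
    with open(path, newline="") as f:
        rows = [r for r in csv.reader(f) if r]
    return datetime.fromisoformat(rows[-1][0]) if rows else None


def run(path=LOG_PATH, interval=INTERVAL_SECONDS, beats=None):
    """Write a heartbeat every `interval` seconds; loop forever if beats is None."""
    count = 0
    while beats is None or count < beats:
        append_heartbeat(path)
        count += 1
        time.sleep(interval)
```

Calling `run()` from a script started at boot (for example through Task Scheduler) keeps the file growing; after a forced reboot, `last_heartbeat("heartbeat.csv")` gives the last moment the machine was demonstrably alive, to within about 30 seconds.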

I’m quite lost, because there are usually some log files I can use to troubleshoot. But in this case, I haven’t found any…

Any and all help would be appreciated.
Much love!
Cutie ^-^

If you don’t have a dump file it will be hard. I assume you have looked in the Administrative Events view, which filters all the errors? Do the crashes occur within a narrow time band? I am thinking of a scheduled task on a VM.
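
To check the time-band idea, you could collect the times of the hangs (for example from the Event ID 6008 "previous system shutdown was unexpected" entries in the System log, or simply from when people had to pull the plug) and count them per hour of day. A rough Python sketch, with a made-up function name and placeholder timestamps that are not real data:

```python
from collections import Counter
from datetime import datetime


def hangs_by_hour(timestamps):
    """Count hangs per hour of day (0-23) from ISO-format timestamp strings."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return dict(sorted(hours.items()))


# Placeholder values only; substitute the times of your own hangs.
example = ["2000-01-03T02:14:00", "2000-01-09T02:41:00", "2000-01-15T17:05:00"]
print(hangs_by_hour(example))
```

If most hangs land in the same hour or two, a scheduled task, backup, or update window becomes a much more likely suspect than flaky hardware.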

You could monitor the server with an application (trial) like PRTG, which is agentless. It might give you a better insight.
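
If a full monitoring suite is more than you want to set up, the same agentless idea can be approximated from any other machine: periodically try a TCP connection to a port the server normally answers on and log the result. The sketch below is only an illustration in Python; the function names are made up, and the host name and port in the usage note are placeholders (3389 is the default RDP port).

```python
import socket
from datetime import datetime


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_line(host, ports, timeout=3.0):
    """Build one log line showing which of the given ports currently answer."""
    status = ", ".join(
        f"{p}={'up' if port_is_open(host, p, timeout) else 'down'}" for p in ports
    )
    return f"{datetime.now().isoformat(timespec='seconds')} {host} {status}"
```

Printing or appending `probe_line("your-server.example", [3389])` once a minute from a machine outside that network gives you the time at which the server stopped answering, independent of anything the server itself manages to log.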

I’d run memtest and the built-in hardware diagnostics.