
Ubuntu Server - VM freezes after a few hours of Xorg session on Tesla M40s

I posted a thread a few days ago about trying to get Blender to run with my Tesla M40s. I have 4 cloned virtual machines with the latest Nvidia Tesla drivers. About 1-3 hours after I start the VMs, they freeze and all the SSH sessions are closed. Proxmox is the host for all the machines, with 1 Tesla M40 passed through to each VM.

Server Specs (2x Dell PowerEdge R720):
CPUs: 2 x E5-2697 v2 2.7GHz
Memory: 128GB
HDD: Dual 10k 1.8TB Seagate enterprise drives (RAID 1)
Operating system
-Proxmox
-VM 1 (Ubuntu Server)
-VM 2 (Ubuntu Server)

I’m not sure which log files will be useful here, so I will start with a few of them. These are the ones I have at the moment from /var/log:

Xorg1.conf:

Dmesg0:

Syslog (right around the crash at 9:19):

Update: the servers have been running for more than 12 hours now since I disabled memory ballooning. The syslog showed a Virt0 Balloon error right before the server stopped responding.
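For anyone hitting the same freeze, here is a rough sketch of how you could check your own syslog for balloon-related messages before a crash. The search pattern and the helper names (`find_balloon_lines`, `scan_file`) are just my assumptions for illustration, not the exact wording of the error, so adjust the pattern to whatever your log actually prints.

```python
import re
from pathlib import Path

# Assumed pattern: matches "balloon" and the "baloon" misspelling, any case.
# The real kernel/driver message may be worded differently.
BALLOON_PATTERN = re.compile(r"ball?oon", re.IGNORECASE)


def find_balloon_lines(lines):
    """Return (line_number, text) pairs for lines that mention ballooning."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if BALLOON_PATTERN.search(line)
    ]


def scan_file(path):
    """Scan one log file and return the matching lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with Path(path).open(errors="replace") as handle:
        return find_balloon_lines(handle)
```

Calling `scan_file("/var/log/syslog")` inside the VM and printing the results should show whether anything balloon-related was logged near the freeze. If it was, I believe running `qm set <vmid> --balloon 0` on the Proxmox host does the same thing as turning ballooning off in the VM's memory settings in the GUI, but check that against your Proxmox version first.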
