ProxMox VM won't start when PCI passthrough is enabled and more than 96GB of memory is assigned

In short, I’m trying to create a FreeNAS VM in ProxMox with PCI passthrough to an HBA and 128GB of memory.

I’ve been exploring ProxMox for a while now but keep running into problems whenever the VM has a PCI passthrough device and more than 96GB of RAM assigned. The VM will start and run fine without the PCI passthrough device, or if I lower the memory assignment. One suggestion I was able to find was to change the “hugepages” value to 2, which allowed the VM to start but made it completely unstable. I don’t know enough about the intricacies of Debian to find out why, and at this point I’m all out of ideas…many of the forum threads I’ve seen are over 2 years old. Currently I’m using ESXi, which doesn’t have this issue, but I’m really hoping to move away from VMware if I can…

Any suggestions?

Or… if someone wants to suggest a different NAS OS, that’s fine. I’d just like to keep ZFS. I also use Samba and iSCSI and run Plex on FreeNAS currently, so I would need to figure those out if I switch. Running FreeNAS bare metal isn’t an option, as I still want an all-in-one box (one that can also run pfSense, Windows, and Ubuntu)…I just need the dang hypervisor to cooperate…

Thank you in advance!

NUMA or UMA?

NUMA. I have a dual-CPU system.

The NUMA box is checked under CPU as well.

Uncheck and try again?

Same error. I’ve tried several other things, like different BIOS types, different CPU core and socket counts, and memory ballooning enabled or disabled. I keep getting the “failed: got timeout” error…

https://forum.proxmox.com/threads/vm-start-timeout-pci-pass-through-related.28876/

That is the thread I ran across; I tried changing hugepages to 2 and turning off memory ballooning. It made the VM very unstable. I was able to get FreeNAS to install (very, very slowly), but I was not able to get it to boot. It could possibly have been that build of FreeNAS 11 U7…but I was unable to find any newer information about it. Most of the information I’ve found on this topic seems to be dated 2016 or 2017.

EDIT: actually, I lied… this was the most recent topic I saw that’s similar to the one posted above…


Again…with the unstable problem…yeah…

Just for completeness’ sake, did you try launching it from the console and, if there was no immediate error, leaving it for 20 minutes in case it was slowly checking/allocating things?

I presume you did, but it worked for one of the guys.

Yup… in fact the VM sat there for almost 2 hours while I was searching for answers. I have a feeling it could be the way ProxMox or KVM handles NUMA, or resources in general, when passing through hardware… Though each CPU can access 128GB of memory, there is no way to know what ProxMox will try to allocate.

After messing with it some more, I did notice in the summary screen that the VM appears to start (there is CPU activity) and the memory bar starts filling up. It (the memory bar) only gets about 3/4 of the way before the timeout… so maybe it is an issue where ProxMox simply times out? I cannot find a parameter anywhere for a timeout on startup, only on shutdown. I don’t know what the “hugepages” parameter does, but it was the only thing that allowed the VM to start. Maybe it somehow speeds up the allocation of memory, but it caused massive instability for me. Maybe it is an issue when using multiple CPUs or something.

It just sucks that I finally got passthrough working, thought it was the last piece of the puzzle, and was ready to green-light the ProxMox switchover, but of course I ran into another issue almost immediately :frowning:

Just a quick one, just to get a more complete picture.

How much RAM is there in the system in total? I am going to guess 192GB???

EDIT: Scratch that, just noticed you mentioned each CPU has 128GB

Also, latest Proxmox and PVE kernel?

The system is a Dell R720xd LFF, 256GB of DDR3 (sixteen 16GB modules), and two E5-2690 v2s (Ivy Bridge). The last time I messed with ProxMox was version 5.4-1, so I downloaded the latest I could find (6.1-1) before starting. I did change the repository to the community one under /etc/apt/sources.list as instructed in the wiki and updated from the command line (not the web GUI), so I believe it is as up to date as my Debian knowledge could take me…
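For reference, the community repo line I mean, assuming the standard no-subscription repository for Proxmox 6 on Debian Buster, is something like:

deb http://download.proxmox.com/debian/pve buster pve-no-subscription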

If I missed something, please let me know. I know enough about Linux to be dangerous (basic CLI filesystem manipulation, creating or changing users and permissions, and updating the system) but still require tutorials for just about everything else.

Thank you in advance

Okay, cool stuff. Funnily enough, I have a similar test system by the looks of it, except I have E5-2680 v2s.

What hardware settings are you running on the VM? What are the BIOS and machine type set to?

What flags do you have set for the PCI passthrough?

I have tried combinations of:
SeaBIOS and OVMF
q35 and i440fx
QEMU agent enabled or disabled
NUMA enabled or disabled, fewer cores with more sockets, more cores with one socket, different CPU “type” selections
Memory ballooning on or off; backing off the memory until about 96GB, where it will boot.

PCI device with PCIe flag on or off, “All functions” checked or unchecked, ROM-Bar checked or unchecked.

Some combinations would complain about a different issue. I forget the exact errors, but they were more of a “requirement not met” type of thing, for instance if you had the PCIe box checked in the PCI passthrough menu but q35 wasn’t selected as the machine type.

The VM will start with as much memory as I want, no problem, if the PCI passthrough device is removed… which I found a bit strange. I remember that when passing through hardware on ESXi, it doesn’t support thin provisioning of memory; you have to select the “reserve memory” option. Maybe something is preventing the system from reserving that memory. I tried slightly smaller memory values until I was able to get it to start. I got it to boot with 98GB once, but after stopping the VM and starting it again, it failed to start. Reducing it more and more, 96GB seemed to be the magic number to get it to start reliably every time.

Well, this is interesting. Just to let you know, I am running into exactly the same issue.

As I step above about 90GB of RAM, the VM says nope and hangs at boot. It gradually swallows more and more RAM until it just hangs and KVM kills it.

It’s not as if I am exceeding the available RAM in either NUMA node. You can see how much is available in each NUMA domain with the following in a terminal:

numactl --hardware
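On a dual-socket box like these, the output looks roughly like this (the CPU lists and sizes here are illustrative, not taken from my machine):

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9
node 0 size: 130946 MB
node 0 free: 121503 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19
node 1 size: 131072 MB
node 1 free: 119874 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10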

I wonder if adding the PCI device is making KVM exceed some form of limit, and it just gets killed.

Will continue to investigate.

Quick question: when you tried “hugepages”, did you just add it to the VM config and nothing else?

I used the following in the CLI:

qm set 101 --hugepages 2

where 101 is the VM’s ID. It seemed to take it with no problem, and the other test VM worked fine. Thank you for the help, I really do appreciate it!

Try removing the hugepages option and starting it slightly differently.

You’ll have to manually remove the ‘hugepages’ line from the VM config in /etc/pve/qemu-server/VMID.conf.
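As a rough sketch, the config will contain something like this (the names and values here are only illustrative; the hugepages line is the one to delete):

balloon: 0
bios: ovmf
cores: 8
hostpci0: 03:00.0,pcie=1
hugepages: 2
machine: q35
memory: 131072
name: freenas
numa: 1
sockets: 2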

Then try starting the VM with the following (from a terminal):
qm showcmd VMID | bash

This should bypass the startup timeout.

Wow that worked! I believe you found the issue!

The only question I have is…and I don’t mean this to sound ungrateful…is there something I can fix, or would I have to start it from the CLI every time? I’m just thinking about recovery if the system restarts for any reason.

On a side note, I’ve been reading up on ZFS (how to use and manage OpenZFS from the CLI) and thinking about possibly just having ProxMox manage the array, with maybe a Linux container handling the sharing, as a plan B in case I couldn’t get this to work. But I’ve been using FreeNAS for several years, which is why I’m so attached to it.

Glad that worked. So, to clear up what’s going on here:

qm (Qemu/KVM Virtual Machine Manager) is a shortcut to launching KVM with a pile of options from a config file.

qm showcmd outputs the command which is used to start a particular machine. It’s mainly intended for debug purposes.
Then with the | bash we’re saying “hey, pipe that output into bash and it will then be executed.”

When you run qm showcmd, it is literally outputting what qm start would run. When that output is piped to bash instead, it takes the startup process away from qm, as we’ve called KVM ourselves.
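In other words, using 101 as a stand-in VMID:

# print the full KVM command line that 'qm start 101' would use
qm showcmd 101

# run that command directly in bash, so qm's start path (and its timeout) is never involved
qm showcmd 101 | bash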

So, from my high-level look at this without diving into the qm source, qm is simply not waiting long enough for KVM to kick into gear with larger machines like this (especially when there are HBAs involved).
EDIT: It is actually more likely to be Proxmox’s API, as qm calls the API to perform any operation on virtual machines.

The problem we’ve got in this case is that Proxmox always executes qm start when you hit ‘Start’ in the GUI.

So with that there are three solutions:

  1. Find what makes qm trip up and see what can be done.
  2. Change how Proxmox starts machines.
  3. Add a cronjob @reboot to run ‘/usr/sbin/qm showcmd VMID | bash’

Adding the cronjob will do for now, as it will ‘fix’ the problem and start the machine at boot. (It may be worth putting it in a script with a slight delay to make sure Proxmox has fully started.)
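The quick version would be something like this in root’s crontab (the VMID and the delay are placeholders):

# crontab -e as root: wait a couple of minutes for Proxmox to settle, then launch the VM
@reboot sleep 120 && /usr/sbin/qm showcmd 101 | bash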

I think I’ll do a little more digging and see if this is a problem with the current Proxmox builds or if the problem lies in qm.

I may try the cron job option. I’m not sure how ProxMox handles “start at boot” when you check that box, but if I go that route it would probably be best to make the script start the other 2 VMs as well.
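Something along these lines is what I’m picturing, with the VMIDs, the path, and the delay all as placeholders:

#!/bin/bash
# /root/start-vms.sh: give Proxmox time to finish booting, then start each VM
# by piping its generated KVM command straight into bash
sleep 120
for vmid in 100 101 102; do
    /usr/sbin/qm showcmd "$vmid" | bash
done

And then a single @reboot /root/start-vms.sh line in root’s crontab would kick it off.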