
Down the rabbit hole / VMs meet Inception: Nested virtualization performance for Android emulators (BlueStacks/NOX) running under Windows running under ESXi 6.7U3

TL;DR: Getting an Android emulator to work in a Windows VM that is running on an ESXi 6.7 host

Hey Everybody!

Back from my adventures virtualizing/consolidating my entire HomeLAB under ESXi (wanted to use Proxmox, but it didn’t work out; I have around 6 hours of footage of the attempt if anyone wants to see a young guy rip his own hair out at 4 in the morning).

Part of the stated goals for this consolidation was to create a Windows VM with my GPU (RX 5600 XT) passed through, to run all of my Steam games and the Windows-only CAD software I need for school, amongst other uses.

As a personal project I’m trying to learn more about nested virtualization, with the end goal being to virtualize a mobile game inside the VM^2.

The hardware I’m running on started as an HP ML350 G6, which now has every single PCIe slot, RAM slot, USB port, etc. filled to the brim with things I’m learning about. The processors are a pair of hex-core Westmere Xeon X5670s @ 2.93 GHz. VT-x/VT-d/EPT are enabled in the BIOS.

The Windows VM has passthrough working (the GPU works well, minus the Navi reset bug).

One of the issues I ran into and solved so far: ESXi doesn’t natively let you pass through a PCIe card while ALSO enabling what they call “Expose hardware assisted virtualization to the guest OS”.

With a tweak to the .vmx config file (adding vhv.enable = "TRUE" and vhv.allowPassthru = "TRUE"),

VT-x/VT-d is passed through to the Windows VM! Here’s where my issue begins. Although both of my Android emulators of choice (BlueStacks and NOX) will launch, the performance is abysmal.
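For anyone following along, the workaround amounts to these two lines in the VM’s .vmx file, edited with the VM powered off (the second key is an undocumented toggle reported to work on 6.7; verify against your build):

```
# Expose hardware-assisted virtualization (VHV) to the guest
vhv.enable = "TRUE"

# Allow VHV to coexist with PCI passthrough devices
vhv.allowPassthru = "TRUE"
```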

I’ve allocated both emulators (not running at the same time) 6 of the 8 threads the VM has, and up to 16 GB of RAM (the VM has 32). Task Manager in Windows shows around 70% CPU utilization, which roughly correlates to 6/8 threads, with the GPU hitting around 15% load at most. Yet I suspect the reason it’s lagging so badly is that it’s not utilizing the VT-x support available.

The threads the VM is running on are all pinned to the same physical CPU, and the RAM is assigned from that same CPU’s memory channels, so data isn’t having to cross the interconnect or jump from one processor to the other.
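For anyone wanting to replicate that pinning, it can be sketched with .vmx options like the following (the exact logical-CPU range is an assumption that depends on your topology; check `numa.nodeAffinity` and `sched.cpu.affinity` against VMware’s docs for your ESXi build):

```
# Keep the VM's memory on NUMA node 0 (the first physical CPU)
numa.nodeAffinity = "0"

# Pin the VM's vCPUs to logical CPUs 0-11, assuming node 0 owns
# LCPUs 0-11 with hyper-threading enabled on a hex-core socket
sched.cpu.affinity = "0-11"
```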

Things I’ve tried include switching the graphics engine between OpenGL and DirectX, and changing the application binary settings within the emulators to ARM (some apps bundle x86 code, others don’t; it’s very, very messy).

I also attempted to enable ASTC textures to see if it would allow for performance counters; it didn’t help at all.

Changing the graphics engine from Compatibility to Performance mode also didn’t yield any measurable difference.

Resolution didn’t seem to have any effect on performance either.

Typically Task Manager would display whether VT-x is detected, but because Windows detects that it’s running in a virtualized environment, it doesn’t display it one way or the other. The usual workaround is to go to the command prompt and run systeminfo. Unfortunately this returns “A hypervisor has been detected…(VT-d/x info)…will not be displayed”.
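A couple of other ways to check from inside the guest, as a sketch (Coreinfo is a Sysinternals tool downloaded separately, and the WMI property name is worth double-checking on your Windows build):

```shell
:: From an elevated command prompt: Coreinfo dumps virtualization-related
:: CPU features; look for VMX and EPT marked with an asterisk
coreinfo.exe -v

:: From PowerShell: ask WMI whether firmware-level virtualization is exposed
powershell -Command "(Get-CimInstance Win32_Processor).VirtualizationFirmwareEnabled"
```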

So, after all of this, here’s what I’d like some help with:

How could I confirm that the VM is indeed able to use virtualization?

In the case where it isn’t getting it, how would I go about enabling it?

In the case where it is getting it, how can I test to see if the application is indeed using those extensions?

The final case, which I really hope isn’t true, is if VT-x/VT-d is working and the application IS using it, but the performance simply isn’t there. I doubt this is the case, as even loading the Google Play Store takes on the order of tens of minutes right now.

One of the things I’ve observed is that running the exact same emulators and Android applications on a mobile i7 from the Sandy Bridge days (same lithography, different architecture, lower clock speeds, less powerful GPU, less RAM) runs flawlessly at 60 fps 1440p for a long time before heat eventually forces clocks lower.

Thanks everyone! Wishing you all good health and good spirits.

-That Canuck Engineering student

Would love some advice from anyone who can offer it!

I’ve done nested virtualization in Proxmox to run KVM inside a VM, but nothing like what you’re trying.

Good luck

Yeah, I’m 90% sure it’s the VM not properly passing down HW virtualization :confused:

When you tried to do nested with Proxmox… did you follow https://pve.proxmox.com/wiki/Nested_Virtualization ?

That’s how I got it working for me.
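For reference, the gist of that wiki page for an Intel Proxmox host boils down to one module option (a sketch; check the wiki for your PVE version, and note that all VMs must be powered off before reloading the module):

```shell
# On the Proxmox host: enable nesting in the kvm_intel module
echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf

# Reload the module so the option takes effect (all VMs must be off)
modprobe -r kvm_intel
modprobe kvm_intel

# Should print Y (or 1) when nesting is enabled
cat /sys/module/kvm_intel/parameters/nested
```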

I notice the CPU model you’re using is kinda old; maybe the performance just is what it is.

Win10 wasn’t meant to run on hardware that old, so maybe that isn’t helping the situation.

Maybe put Windows on bare metal and try the Android emulator there, to see how the performance compares to when you had the nested setup going.

If it’s the same… then it’s likely just the older hardware?

Yeah, I ended up not doing Proxmox for my setup due to problems with VFIO and RMRR checks related to my hardware. This thread details the steps I attempted, to no avail. Considering trying again if I can’t get it working under ESXi. https://forum.proxmox.com/threads/compile-proxmox-ve-with-patched-intel-iommu-driver-to-remove-rmrr-check.36374/

My thinking on why it shouldn’t be a performance thing: my laptop (2014 Retina MacBook), running at 2.1 GHz on 6 threads due to thermals, is capable of 1080p at 60 fps using Iris graphics. Even with the virtualization handicap, I would have thought that a system with 8 threads running at 50% higher clocks, with twice the memory and a much more powerful GPU, would at least be able to meet performance parity (not to mention that it’s loading off of a modern NVMe drive).

Any ideas on how to test whether VT-x is actually being passed through correctly? I’m at a loss on how to make sure it’s working, given that Windows won’t return a true or false right now.