While working with the Intel x520 chipset, creating virtual functions with ixgbe/ixgbevf, I’ve found that guests get the expected download speeds but nowhere near the upload speeds I would expect. Here are the results I’m seeing:
Lastly, here are the speed test results I’m seeing from the VM when I disable the x520 chipset entirely from the guest’s perspective and instead run the test while sharing the host’s networking adapter (no SR-IOV NIC, no discrete IPv4 assigned to the guest):
Both the Intel x520 SR-IOV NIC and the host’s integrated motherboard NIC are plugged into our top-of-rack switch, and both the host and the guest have discrete IPv4 addresses assigned.
To narrow down the possible causes, I disabled SR-IOV and instead passed the full x520 interface into the guest (full VFIO passthrough); the speed test results in that scenario were nearly identical, with around 2 Mbit/s upload and 750 Mbit/s download.
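For anyone wanting to reproduce the passthrough test, the binding can be confirmed from the host via sysfs before rebooting into the guest. This is just a sketch; the PCI address below is a placeholder for the x520 PF (find yours with `lspci | grep -i ethernet`):

```shell
BDF=0000:05:00.0   # placeholder PCI address -- substitute your x520 PF's

# Which driver owns the device: normally "ixgbe" on the host,
# or "vfio-pci" once it has been rebound for passthrough.
if [ -e "/sys/bus/pci/devices/$BDF/driver" ]; then
    basename "$(readlink "/sys/bus/pci/devices/$BDF/driver")"
else
    echo "no driver bound (or $BDF not present)"
fi
```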
I’m on a Dell R7610 host with the x520 in an x8 PCIe Gen 3 slot. I’m using ADOP 10G SFP+ modules, shown here:
Here are the /etc/modprobe.d/ixgbe.conf settings I am using when virtual functions are disabled:
Here are the /etc/modprobe.d/ixgbe.conf settings I am using when virtual functions are enabled:
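For context on what that file does: enabling VFs through ixgbe comes down to the max_vfs module option. A minimal sketch (the count of 8 is illustrative, not necessarily my actual value):

```
# /etc/modprobe.d/ixgbe.conf -- minimal sketch; max_vfs=8 is an illustrative count
options ixgbe max_vfs=8
```

On newer kernels the VF count can also be set at runtime through sysfs, e.g. `echo 8 > /sys/class/net/<pf>/device/sriov_numvfs`, which the ixgbe documentation recommends over the deprecated max_vfs parameter.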
I am at a loss as to how I could be getting the expected download throughput in both full VFIO passthrough mode and virtual function mode, yet seeing such a massive bottleneck on upload. Any help would be much appreciated.
Can you run iperf across the LAN and between two VMs? I agree you probably have an issue there, but speedtest-cli can be all over the place. I’d like to see if there’s a difference when two SR-IOV instances are transferring to each other versus across a wire.
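Something along these lines between the two endpoints would do it. The address is a placeholder (TEST-NET-1); substitute the server VM’s discrete IPv4. The commands are printed rather than executed here since the address is a placeholder:

```shell
# Placeholder address -- substitute the server VM's discrete IPv4.
SERVER_VM=192.0.2.10

# On the server VM:  iperf3 -s
# On the client VM, run both directions (30 s each):
echo "iperf3 -c $SERVER_VM -t 30"      # client -> server: the slow upload direction
echo "iperf3 -c $SERVER_VM -t 30 -R"   # server -> client (reverse mode): the download direction
```

Running both the normal and `-R` (reverse) directions from the same client pins down whether the bottleneck is tied to the transmit path of one particular NIC.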
Can you also give a breakdown of the OSes involved? I see the VM is Ubuntu, but I’m not sure what the physical host is running. I assume you’re virtualizing with QEMU/KVM?
Here are the iperf results. I performed this test between two virtual machines addressed by their discrete IPv4 addresses. Both VMs are using SR-IOV networking on the same Intel x520 within the same host.
I just ran another test, this time between guests with discrete IPv4 addresses that are within the same server rack but on entirely separate physical hosts (both identical Dell servers running the Intel x520 SR-IOV NIC).
So when VMs address each other from within the same device (via WAN IP) there appears to be no problem at all, but when they address each other via WAN IP across different devices, the same problem appears that was seen with speedtest-cli when using SR-IOV networking.
Sadly, I have not yet found a fix for this. I’ve only confirmed that SR-IOV networking between VMs virtualized on the same host is fast, but that all networking outside of the host, including between hosts within the same cluster, is bottlenecked to speeds below 10 Mbit/s.
The only other step I can think to try is replacing the SFP modules and rerunning similar speed tests. I have some ordered that should arrive this week, but until then I’m at a loss for what to try next. If I’m able to identify a fix for this issue, I’ll post back here.
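In the meantime, for anyone in a similar spot, these are the host-side checks I plan to run before and after swapping the modules. The interface name is a placeholder; substitute the x520 PF (see `ip link`):

```shell
# Interface name is a placeholder -- substitute the x520 PF (see `ip link`).
PF=ens1f0

if [ -d "/sys/class/net/$PF" ]; then
    ethtool -m "$PF"    # SFP+ module EEPROM/DOM: vendor, part number, Rx/Tx power
    ethtool -a "$PF"    # pause-frame (flow-control) settings
    # Error/drop/pause counters; grep may legitimately find none, hence || true
    ethtool -S "$PF" | grep -iE 'err|drop|pause' || true
else
    echo "interface $PF not present"
fi
```

A bad or marginal SFP+ module often shows up as low Tx/Rx power in the DOM readout or as climbing error counters in one direction only, which would line up with an upload-only bottleneck.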