Huge disparity in upload/download speed with Intel x520 SR-IOV NIC

While working with the Intel x520 chipset, creating virtual functions with ixgbe/ixgbevf, I’ve found that guests get the expected download speeds but nowhere near the expected upload speeds. Here are the results I’m seeing:

For comparison, here’s the performance I’m seeing on the host system using its motherboard NIC:

Lastly, here are the speed test results from the VM when the x520 is removed from the guest’s view entirely and the test runs over the shared host networking adapter instead (no SR-IOV NIC, no discrete IPv4 assigned to the guest):
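For context, the tests above and below are speedtest-cli runs from the command line, along these lines (the package name may vary by distro):

```
# Install and run the Speedtest CLI (Ubuntu package name; may vary by distro)
sudo apt install speedtest-cli
speedtest-cli
```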

Both the Intel x520 SR-IOV NIC and the host’s integrated motherboard NIC are plugged into our top-of-rack switch, and both the host and the guest have discrete IPv4 addresses assigned.

To narrow down the possible causes, I disabled SR-IOV and instead passed the full x520 interface into the guest (full VFIO passthrough). The speed test results in that scenario were nearly identical: around 2 Mbit/s upload and 750 Mbit/s download.
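The passthrough itself was the usual vfio-pci bind, roughly like this (the PCI address and device ID here are examples for an 82599-based card and will differ per system):

```
# Identify the x520 port (82599-based); address/ID below are examples
lspci -nn | grep -i 82599
# e.g. 41:00.0 Ethernet controller [0200]: Intel 82599ES 10-Gigabit ... [8086:10fb]

# Unbind from ixgbe and hand the whole device to vfio-pci
sudo modprobe vfio-pci
echo 0000:41:00.0 | sudo tee /sys/bus/pci/devices/0000:41:00.0/driver/unbind
echo 8086 10fb | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
```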

I’m on a Dell R7610 host with the x520 in a PCIe Gen 3 x8 slot. I’m using ADOP 10G SFP+ modules, shown here:
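If it helps identify the exact optics, the module EEPROM can usually be read with ethtool (the interface name below is an example; support depends on the driver):

```
# Dump SFP+ module vendor/part/serial info (interface name is an example)
sudo ethtool -m enp65s0f0
```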

Here are the /etc/modprobe.d/ixgbe.conf settings I’m using when virtual functions are disabled:
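As a plain-text illustration (not an exact transcription of my file), an ixgbe.conf without VFs typically just sets module options, such as the one commonly needed for third-party SFP+ optics:

```
# /etc/modprobe.d/ixgbe.conf (illustrative example, not the exact file)
# allow_unsupported_sfp lets ixgbe bring up links on third-party SFP+ modules
options ixgbe allow_unsupported_sfp=1
```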

Here are the /etc/modprobe.d/ixgbe.conf settings I’m using when virtual functions are enabled:
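Again as an illustration, the usual difference with VFs enabled is the max_vfs parameter (the VF count here is an example):

```
# /etc/modprobe.d/ixgbe.conf (illustrative example)
# max_vfs asks ixgbe to create that many virtual functions per port
options ixgbe allow_unsupported_sfp=1 max_vfs=8
```

Newer setups often set the VF count through sysfs instead, e.g. echo 8 > /sys/class/net/<iface>/device/sriov_numvfs.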

I’m at a loss as to how I could be getting the expected download throughput in both full VFIO passthrough mode and virtual function mode, yet seeing such a massive bottleneck on upload. Any help would be much appreciated.


Can you run iperf across the LAN and between two VMs? I agree you probably have an issue there, but speedtest-cli can be all over the place. I’d like to see if there’s a difference when two SR-IOV instances are transferring to each other vs. across a wire.
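Something like this covers both directions (the address is a placeholder; iperf3 shown, but iperf2 works similarly):

```
# On one VM (server)
iperf3 -s

# On the other VM (client); 10.0.0.10 is a placeholder address
iperf3 -c 10.0.0.10 -t 30       # client -> server: exercises the client's TX path
iperf3 -c 10.0.0.10 -t 30 -R    # reverse: server -> client, exercises client's RX
```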

Can you also give a breakdown of the OSes involved? I see the VM is Ubuntu, but I’m not sure what the physical host is. I assume you’re virtualizing with QEMU/KVM?


The host is Ubuntu 20.04 and the guest is also Ubuntu 20.04. I’ve run this same test in Windows 10 LTSC using the x520 virtual function drivers and seen the same results.

Will get an iperf between the VMs in a moment.


Here are the iperf results. I performed this test between two virtual machines addressed by their discrete IPv4 addresses. Both VMs are using SR-IOV networking on the same Intel x520 within the same host.

iperf server:

iperf client:

I just ran another test, this time between guests with discrete IPv4 addresses within the same server rack but on entirely separate physical hosts (both identical Dell servers running the Intel x520 SR-IOV NIC).

iperf server:

iperf client:

So when VMs address each other from within the same device (via WAN IP) there appears to be no problem at all, but when they address each other via WAN IP from different devices, the same problem shows up as was seen with speedtest-cli when using SR-IOV networking.

I am at a total loss for what this could be.

Thanks for the awesome information.

Sadly, I have not yet found a fix for this. I’ve only confirmed that SR-IOV networking between VMs virtualized on the same host is fast, but all networking outside of the host, including between hosts within the same cluster, is bottlenecked down to below 10 Mbit/s.
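In the meantime, the counters that usually hint at flow-control problems or failing optics can be pulled with ethtool (the interface name below is an example):

```
# Port statistics: look for tx errors, drops, CRC, and pause-frame counters
sudo ethtool -S enp65s0f0 | grep -Ei 'pause|err|drop|crc'

# Current flow-control (pause) settings negotiated on the port
sudo ethtool -a enp65s0f0
```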

The only other step I can think to try is replacing the SFP modules and rerunning similar speed tests. I have some ordered that should arrive this week, but until then I’m at a loss for what to try next. If I’m able to identify a fix for this issue, I’ll post back here.


I know this is a lot of work, but could you try ESXi, Hyper-V or some other platform to see if it’s a bug in the driver?