Bridge on bond doesn't come up after reboot (using SR-IOV VF's)

Been bashing my head against this for weeks now. If anyone has any suggestions I’d be very grateful!!

I have PVE 8.3 installed on a machine with a dual port SFP+ NIC (X710-DA2). I’ve set up SR-IOV VF’s using these udev rules:

ACTION=="add", SUBSYSTEM=="net", ENV{INTERFACE}=="enp1s0f0np0", ATTR{device/sriov_numvfs}="3"
ACTION=="add", SUBSYSTEM=="net", ENV{INTERFACE}=="enp1s0f1np1", ATTR{device/sriov_numvfs}="3"

Using the PVE GUI, I’ve set up a bridge on a bond on two of the VF’s resulting in the following interfaces file:

auto lo
iface lo inet loopback

auto enp1s0f0v0
iface enp1s0f0v0 inet manual

auto enp1s0f1v0
iface enp1s0f1v0 inet manual

auto bond0
iface bond0 inet manual
  bond-slaves enp1s0f0v0 enp1s0f1v0
  bond-miimon 100
  bond-mode active-backup
  bond-primary enp1s0f0v0

auto vmbr0
iface vmbr0 inet manual
  bridge-ports bond0
  bridge-vlan-aware yes
  bridge-vids 2-4094
  bridge-stp off
  bridge-fd 0

auto vmbr0.10
iface vmbr0.10 inet static
  address 10.7.0.20/24
  gateway 10.7.0.1
source /etc/network/interfaces.d/*

After applying the changes, the network works as expected. After a reboot, though, none of the interfaces come up:

> ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp1s0f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 68:05:ca:9b:d6:84 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
3: enp1s0f1np1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 68:05:ca:9b:d6:85 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
4: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 8c:16:45:92:88:9b brd ff:ff:ff:ff:ff:ff
    altname enp0s31f6
5: enp1s0f0v0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN mode DEFAULT group default qlen 1000
    link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff
6: enp1s0f1v0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN mode DEFAULT group default qlen 1000
    link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff
7: enp1s0f0v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 86:62:db:16:bf:ef brd ff:ff:ff:ff:ff:ff
8: enp1s0f1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether e6:93:a7:a5:17:b9 brd ff:ff:ff:ff:ff:ff
9: enp1s0f1v2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether de:7d:9a:5c:a5:a3 brd ff:ff:ff:ff:ff:ff
10: enp1s0f0v2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 96:cd:ab:44:2c:57 brd ff:ff:ff:ff:ff:ff
11: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue master vmbr0 state DOWN mode DEFAULT group default qlen 1000
    link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff
12: vmbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff
13: vmbr0.10@vmbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether ba:4e:ac:fb:fe:ef brd ff:ff:ff:ff:ff:ff

I’m not seeing any dmesg errors or even ifupdown2 errors in the debug logs. All the commands run successfully, but the links just aren’t up. The weird thing is, if I give the bond a static IP, it comes up just fine. It also works if I stop using SR-IOV VF’s. So there’s some interaction, but I have no idea what.

It looks like the miniport driver for the VFs is stepping on itself in your config. What is the reason for using VFIO as opposed to VIRTIO?

Does the physical NIC have a VFIO capable driver installed on the host?

also, not part of this probably, but ‘static’ is a legacy term and should just be ‘manual’

Miniport is a Windows thing, no? I’m using VFIO because not all the VF’s will be passed through to VMs.

The physical nics use i40e, and the VFs are using iavf, which is the associated VF driver so I think that should all be fine. And everything works as expected if I set it all up after boot.

I thought maybe something isn’t getting loaded fast enough when the VFs are set up, but even if I add sleeps before the ifup, it still doesn’t work. That, plus the fact that giving the bond an IP seems to fix it, argues against a race condition.
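One way to test the race idea more directly than a fixed sleep is to poll the VF's link state and give up after a timeout. The sketch below is only a diagnostic: the function name and the `SYSFS_ROOT` override are made up for illustration (the override exists so the function can be tried against a fake directory), and it reads the standard `/sys/class/net/<iface>/operstate` file.

```shell
#!/bin/sh
# Poll an interface's operstate until it reads "up", or fail after a timeout.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
wait_for_link_up() {
    ifname="$1"
    timeout="${2:-10}"   # seconds to wait before giving up
    state_file="${SYSFS_ROOT:-/sys}/class/net/$ifname/operstate"
    waited=0
    while [ "$(cat "$state_file" 2>/dev/null)" != "up" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example: wait_for_link_up enp1s0f0v0 15 || echo "enp1s0f0v0 never reported link up"
```

If this times out on the *v0 VFs after boot but not on the *v1 VFs, that would point away from a simple timing problem and toward something the bridge does to bond0.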

Yeah, the Proxmox GUI just uses static for some reason.

Your point about the driver made me think to check dmesg for any driver messages. There definitely seems to be some difference between how ifup is handling bond0 and bond1 because of the bridge (vmbr0). You can see the *v1 VF’s come up right away, but the *v0 VF’s don’t…

[    6.959146] bond0: (slave enp1s0f0v0): Enslaving as a backup interface with a down link
[    6.962923] bond0: (slave enp1s0f1v0): Enslaving as a backup interface with a down link
[    6.978995] vmbr0: port 1(bond0) entered blocking state
[    6.979000] vmbr0: port 1(bond0) entered disabled state
[    6.979007] bond0: entered allmulticast mode
[    7.243419] 8021q: 802.1Q VLAN Support v1.8
[    7.243429] 8021q: adding VLAN 0 to HW filter on device enp1s0f0v0
[    7.243503] 8021q: adding VLAN 0 to HW filter on device enp1s0f1v0
[    7.243530] 8021q: adding VLAN 0 to HW filter on device bond0
[    7.310869] 8021q: adding VLAN 0 to HW filter on device enp1s0f0v1
[    7.310920] bond1: (slave enp1s0f0v1): Enslaving as a backup interface with a down link
[    7.314099] 8021q: adding VLAN 0 to HW filter on device enp1s0f1v1
[    7.314151] bond1: (slave enp1s0f1v1): Enslaving as a backup interface with a down link
[    7.317212] 8021q: adding VLAN 0 to HW filter on device bond1
[    7.395201] iavf 0000:02:02.1 enp1s0f0v1: NIC Link is Up Speed is 10 Gbps Full Duplex
[    7.399144] iavf 0000:02:0a.1 enp1s0f1v1: NIC Link is Up Speed is 10 Gbps Full Duplex
[    7.423834] bond1: (slave enp1s0f0v1): link status definitely up, 10000 Mbps full duplex
[    7.423845] bond1: (slave enp1s0f1v1): link status definitely up, 10000 Mbps full duplex
[    7.423847] bond1: (slave enp1s0f1v1): making interface the new active one
[    7.423870] bond1: active interface up!

I also found some messages when I started playing with ifdown/ifup after boot…

[  490.406628] bond0: left allmulticast mode
[  490.406857] vmbr0: port 1(bond0) entered disabled state
[  490.700789] bond0 (unregistering): (slave enp1s0f0v0): Releasing backup interface
[  490.708664] bond0 (unregistering): (slave enp1s0f1v0): Releasing backup interface
[  490.722696] i40e 0000:01:00.1: Cannot add more MAC addresses, VF is not trusted, switch the VF to trusted to add more functionality
[  490.722776] iavf 0000:02:0a.0: Failed to add MAC filter, error IAVF_ERR_NVM
[  490.722817] bond0 (unregistering): Released all slaves
[  490.749763] iavf 0000:02:02.0: Too many delete VLAN changes in one request
[  490.764703] iavf 0000:02:0a.0: Too many delete VLAN changes in one request
[  490.771717] iavf 0000:02:02.0: Too many delete VLAN changes in one request
[  490.785729] iavf 0000:02:0a.0: Too many delete VLAN changes in one request
[  490.864622] bond1 (unregistering): (slave enp1s0f0v1): Releasing backup interface
[  490.881551] bond1 (unregistering): (slave enp1s0f1v1): Releasing backup interface
[  490.889956] bond1 (unregistering): Released all slaves
[  498.397748] 8021q: adding VLAN 0 to HW filter on device enp1s0f0v0
[  498.397805] bond0: (slave enp1s0f0v0): Enslaving as a backup interface with a down link
[  498.400658] 8021q: adding VLAN 0 to HW filter on device enp1s0f1v0
[  498.400713] bond0: (slave enp1s0f1v0): Enslaving as a backup interface with a down link
[  498.418565] iavf 0000:02:02.0: Too many add VLAN changes in one request
[  498.418713] i40e 0000:01:00.0: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.422622] iavf 0000:02:0a.0: Too many add VLAN changes in one request
[  498.422827] i40e 0000:01:00.1: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.439542] iavf 0000:02:02.0: Too many add VLAN changes in one request
[  498.439675] i40e 0000:01:00.0: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.443539] iavf 0000:02:0a.0: Too many add VLAN changes in one request
[  498.443693] i40e 0000:01:00.1: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.448383] vmbr0: port 1(bond0) entered blocking state
[  498.448387] vmbr0: port 1(bond0) entered disabled state
[  498.448409] bond0: entered allmulticast mode
[  498.460663] i40e 0000:01:00.0: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.465627] i40e 0000:01:00.1: VF is not trusted, switch the VF to trusted to add more VLAN addresses
[  498.502749] iavf 0000:02:02.0 enp1s0f0v0: NIC Link is Up Speed is 10 Gbps Full Duplex
[  498.508713] iavf 0000:02:0a.0 enp1s0f1v0: NIC Link is Up Speed is 10 Gbps Full Duplex
[  498.682651] 8021q: adding VLAN 0 to HW filter on device bond0
[  498.683446] bond0: (slave enp1s0f0v0): link status definitely up, 10000 Mbps full duplex
[  498.683454] bond0: (slave enp1s0f1v0): link status definitely up, 10000 Mbps full duplex
[  498.683456] bond0: (slave enp1s0f0v0): making interface the new active one
[  498.683458] iavf 0000:02:02.0 enp1s0f0v0: entered allmulticast mode
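The i40e lines in that log ask for the VFs to be switched to trusted. A minimal sketch of doing that with iproute2's `ip link set ... vf N trust on` follows. The helper name is made up, and it only prints the commands so it can be reviewed before anything is changed. Whether trusting the VFs fixes the boot-time problem is untested here. The log does suggest that the VLAN-aware bridge is trying to push more VLAN filters onto the VFs than an untrusted VF is allowed.

```shell
#!/bin/sh
# Print (do not run) the commands that would mark VFs 0..count-1 of a PF as trusted.
# Pipe the output to `sh` as root to actually apply it.
vf_trust_cmds() {
    pf="$1"      # physical function, e.g. enp1s0f0np0
    count="$2"   # number of VFs on that PF
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'ip link set dev %s vf %d trust on\n' "$pf" "$i"
        i=$((i + 1))
    done
}

# Example for the 3 VFs per port created by the udev rules:
# vf_trust_cmds enp1s0f0np0 3
# vf_trust_cmds enp1s0f1np1 3
```

The trust flag is runtime state on the PF, so it would have to be reapplied on every boot after the VFs are created and before the bond is brought up.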

Yes, I did assume you are handing VFs to mixed VMs… because I cannot come up with any other reason to use a VF.

yeah i was going to suggest that to see what happens.

also do you have a PVID assigned to the bridge? it would be another thing to test.

also, if you have time, can you explain what you are doing with the VFs? more for my sanity, than troubleshooting.

PVID didn’t fix it unfortunately. I tried removing the vmbr0.10 too, and that didn’t matter. So maybe not VLAN related. I’ll try updating the drivers next, I guess. I’m getting suspicious of the MAC addressing… maybe having the driver pick it is causing problems. I also noticed the third VF doesn’t have any MAC…
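To make the MAC situation easier to see, here is a small sketch that reads plain `ip link` output on stdin and lists every MAC that appears on more than one interface. The function name is made up. It deliberately ignores the indented `vf N link/ether ...` lines under each PF and only looks at each interface's own `link/ether` line.

```shell
#!/bin/sh
# Read `ip link` output on stdin and print MACs shared by more than one interface.
dup_macs() {
    awk '
        # Header line, e.g. "5: enp1s0f0v0: <...>" or "13: vmbr0.10@vmbr0: <...>"
        /^[0-9]+: / { name = $2; sub(/:$/, "", name); sub(/@.*/, "", name) }
        # The interface MAC line; "vf N link/ether ..." lines start with "vf" and are skipped
        $1 == "link/ether" { macs[$2] = macs[$2] " " name; n[$2]++ }
        END { for (m in n) if (n[m] > 1) print m ":" macs[m] }
    '
}

# Example: ip link | dup_macs
```

A shared MAC across the bond, its slaves, and the bridge is normal by itself: with the default `fail_over_mac` setting the bonding driver gives every slave the bond's MAC. What matters here is that each VF has to get that MAC accepted by the PF, which ties in with the "Cannot add more MAC addresses, VF is not trusted" message.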

I’m using VF’s mainly to set up network redundancy. My switches don’t support LAGGs, so I’m creating two active-backup bonds on the VF’s. One uses NIC 0 active, NIC 1 backup, the other uses NIC 1 active, NIC 0 backup. It means you split bandwidth when both are up, but that’s fine for me since I want to divide my network in two. If one switch goes down, both share the other temporarily, avoiding any downtime.
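For anyone trying to picture the layout, the sketch below prints the two bond stanzas side by side. The bond0 values come from the interfaces file above. The bond1 stanza is an assumption: it is not shown in this thread, and it is reconstructed here from the dmesg output (slaves enp1s0f0v1 and enp1s0f1v1, with enp1s0f1v1 becoming active). The helper name is made up.

```shell
#!/bin/sh
# Print an ifupdown stanza for an active-backup bond.
# Arguments: bond name, primary slave, backup slave.
bond_stanza() {
    name="$1"; primary="$2"; backup="$3"
    cat <<EOF
auto $name
iface $name inet manual
  bond-slaves $primary $backup
  bond-miimon 100
  bond-mode active-backup
  bond-primary $primary
EOF
}

# bond0 prefers port 0, bond1 (assumed config) prefers port 1.
bond_stanza bond0 enp1s0f0v0 enp1s0f1v0
echo
bond_stanza bond1 enp1s0f1v1 enp1s0f0v1
```

The only real difference between the two is which port's VF is `bond-primary`, which is what splits traffic across the two switches while both are up.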

You can’t do this on the PF’s because of how the Linux bonding driver sets up MAC addresses. You can do the same thing with bridges instead of VF’s, but I figured VF’s would perform better. That, and something about bridge on bond on bridges felt worse than bridge on bond on VFs.

Insane? Maybe. But it does work :)

Yeah, that was going to be my suggestion. I have gotten weird things to work fine on bridges and basic things to not work in VFIO. I almost universally just build bridges now.