(Solved) Intel 82599es based NIC flapping issue

I have a Dell R510 running the newest version of OMV with an Intel 82599ES-based Silicom PE210G2BPI9 2x10Gbps LC/LC MMF NIC on the newest ixgbe driver (downloaded and compiled locally). I also have a Dell R710 running Ubuntu 20.04 LTS with the same NIC and the same driver. The NICs are connected directly to one another with brand new LC/LC MMF cables and use static private IP addresses.

I am able to raise a link between the NICs on their respective hosts, but the link keeps flapping and, no matter what I do, I cannot work out why:

$ dmesg | tail

[ 4083.171821] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Down
[ 4083.199845] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 4083.280158] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Down
[ 4083.303846] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 4085.187773] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Down
[ 4085.727752] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 4087.203750] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Down
[ 4087.743725] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 4089.219666] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Down
[ 4089.759698] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
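
To put a number on the flapping rather than eyeballing the log, a small helper like the one below can count the "Link is Down" events for one interface. This is only a sketch: count_link_downs is a made-up name, and the pattern assumes the ixgbe message format shown in the output above.

```shell
# Count "NIC Link is Down" events for one interface.
# Reads dmesg-style text on stdin; the interface name is the first argument.
# count_link_downs is a hypothetical helper, not part of any tool.
count_link_downs() {
    iface="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c "ixgbe .* ${iface}: NIC Link is Down" || true
}

# Usage on a live system (reading the kernel log may require root):
#   dmesg | count_link_downs enp3s0f0
```

Running it before and after a change (firmware, driver, cable) gives a quick way to tell whether the change made any difference.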

Before upgrading the firmware, this was happening on both hosts whenever a link was established.
After upgrading the firmware, the issue only seems to be happening on the OMV host, which is why I’m posting here now.

I have checked transmit/receive and driver/interface errors and there are none.
I have also checked and even replaced the LC/LC MMF cables.
At this point, I’m wondering if there’s some strange configuration race condition or mix-up somewhere from when I was troubleshooting getting these NICs up in the first place (it wasn’t easy getting to this point).
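
For anyone repeating the error-counter check mentioned above, this is roughly how it can be done with ethtool's per-interface statistics. It is a sketch under assumptions: nonzero_error_counters is a made-up helper name, and the keyword list (err, drop, fail, crc) is my guess at which counter names matter; the exact counters depend on the driver version.

```shell
# Print only the non-zero error-like counters from "ethtool -S" output.
# Reads the statistics on stdin and prints one "name=value" pair per line.
# nonzero_error_counters is a hypothetical helper, not part of ethtool.
nonzero_error_counters() {
    awk -F': *' '
        /(err|drop|fail|crc)/ && $2 + 0 != 0 {
            gsub(/^[ \t]+/, "", $1)   # strip the leading indentation
            print $1 "=" $2
        }'
}

# Usage on a live system:
#   ethtool -S enp3s0f0 | nonzero_error_counters
```

Empty output means every matching counter is zero, which is what I saw on both hosts.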

Does anyone have any insight as to what the issue might be or where I can go from here?

FYI - To anyone who finds this in the future:

The NIC I was using on both ends is a Silicom PE210G2BPI9. These cards have a bypass mode that is enabled by default, so I first had to download Silicom's driver control software (bpctl) to disable it. With bypass off, the cards showed the flapping issue under the repository-installed ixgbe package, which was quite old compared to the newest one available directly from Intel. Installing the newest driver from Intel changed the symptoms slightly, but the flapping persisted. Ultimately, I solved the issue by installing the validated ixgbe driver included in the Silicom tarball, the same download that contains the driver control software. The version I used to successfully bring up the link was ixgbe-5.9.4ms7.1.


Here are some useful links and a TL;DR:

  1. Retrieve the Silicom driver that matches your NIC. For me it was here: https://www.silicom-usa.com/drivercat/bypass-1/?pname=PE210G2BPI9%20Ethernet%20Bypass
  2. Clear the system of pre-existing ixgbe drivers (assuming you have no other NICs using the driver)
  3. Install and run the bpctl utility to disable bypass on each port, and make that setting persist across power cycles
  4. Make and install the Silicom-provided ixgbe driver (I used version 5.9.4ms7.1), which can be found in a subfolder of the driver download directory
  5. Reload/run the ixgbe driver, and your system should see the NICs
  6. Configure networking for each port as normal
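
After reloading the driver, it is worth confirming that the interface is really bound to the freshly built module and not an older copy left on the system. A minimal sketch, assuming the usual "key: value" layout of ethtool -i output; loaded_driver is a made-up helper name, and the exact version string the Silicom build reports is not something I can promise.

```shell
# Extract the driver name and version from "ethtool -i" output on stdin.
# Prints "<driver> <version>" on a single line.
# loaded_driver is a hypothetical helper, not part of ethtool.
loaded_driver() {
    awk -F': *' '
        $1 == "driver"  { d = $2 }
        $1 == "version" { v = $2 }
        END { print d " " v }'
}

# Usage on a live system, as root, after installing the new module.
# Only do this if no other NIC (or your SSH session) depends on ixgbe:
#   modprobe -r ixgbe && modprobe ixgbe
#   ethtool -i enp3s0f0 | loaded_driver
```

If the version printed is still the old repository one, a stale module is probably being picked up, which is what step 2 is meant to prevent.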

Helpful references/tools:

  • https://www.reddit.com/r/homelab/comments/glimrw/proxmox_62_switch_off_nic_bypass_mode_on_riverbed/
  • https://www.reddit.com/r/homelab/comments/9v56c2/wanting_to_pick_up_an_r210ii_for_pfopnsense_also/
  • https://www.reddit.com/r/homelab/comments/dwpy2u/dell_r210_ii_oem_flashing/
  • ethtool was extremely helpful; familiarize yourself with its man page and help output