Help with Mellanox ConnectX-4 Lx NIC

Hi all, and thank you for reading this. My problem is that I am upgrading from a 10 Gb network, which is all CAT 8/RJ45 with 10 Gb NICs and has worked for me without any problems over the last few years. I have built up a kind of home lab system and am very interested in networking; I have some knowledge but not much. My home lab is all Ubiquiti brand only.
Router is a UniFi Dream Machine SE.
2 x switches: 1 x 10 Gb USW-Aggregation, the other a USW-Aggregation Pro 25 Gb. I have removed the 25 Gb switch, which may have a fault and may need returning. The fault, simply put: if one pushes one of the SFP28 connections gently (port 30), and I mean gently, the white light goes out and the 25 Gb connection is lost.
The other 3 ports are rock solid, but this fault was found after 2 weeks of speed-transfer tests (SMB, FTP server, Win 10 Pro) using Gen4 x4 NVMe drives as a test for a hoped-for storage medium away from HDD. My connection speeds were around 750 Mb/s and never exceeded 1 GB while testing.
Sometimes during the tests at 25 Gb speeds, any of the 4 ports even flatlined for about 10 seconds. Different file sizes were tested, from 90 GB down to the smallest: films, photos, etc.
Sorry, getting off topic. I bought 2 x Mellanox ConnectX-4 Lx NICs from a dealer in the UK. I installed the drivers from the Nvidia site. There are 2 types to choose from, and I tried both. The NICs were not Dell or HP branded cards. The drivers showed as installed. I have enclosed 2 very detailed sets of info taken from one of the cards I tested on 2 PCs. They are screen grabs, sorted in Photoshop, and show the basic installed driver details to help you.
I have had so many problems trying to get these NICs to work. I have done 15 reinstalls on each PC trying to test these cards, but they will not show up on the Network task bar. I get the PCs WORKSTATION and the other PC, named Backup Media PC, in the Media Devices section that is below Network. When I go back to the 10 Gb RJ45 connection and use the adapter for that, all the same PC names appear in the Network section. I cannot get them to appear when trying to use the 25 Gb Mellanox NICs.
I have checked Device Manager after installing the drivers, and it says the card is working properly. I can ping both PCs and they respond properly. I can download at my max speed through these cards, browse the net, etc., but when I try to set up a network with these 2 cards I hit a brick wall.
Even from a full clean install, I have tried just the Windows 10 Pro install with basic drivers, with no luck, and then installing the GPU drivers and AMD chipset drivers before turning my attention to getting the network set up. At this stage Windows installs a driver for the ConnectX-4 from 2018, and it does allow Internet access. Whenever I change even one thing or install a new driver, I always reboot the PC.
I have found now that every time I boot Windows, MEDIA STREAMING OPTIONS is turned off. I turn it on and try to get the Network icon to appear, but nada, it is not happening, and after a reboot MEDIA STREAMING OPTIONS is off again.
I have to turn on Network Discovery manually every time, and with each reboot it has turned itself off again. But when I go back to the RJ45 10 Gb NICs after this, it works a treat and I have full network access to all of my PCs. I am now down to testing 1 card in one machine. I have tried the 2 cards in 2 different machines, even putting them in a PCIe x4 slot when only PCIe 3.0 x8 is needed. Still nothing.
I have tried the Function Discovery Provider Host tip and others, and now I am just out of ideas about what to do.
Do I have 2 x faulty Mellanox ConnectX-4 Lx cards?
Is there something I need to turn on in the ADVANCED tab of these NICs that I am missing? In this area I am clueless, and on that point please find the detailed information from the DETAIL and ADVANCED tabs.
If you need any more info, please say so here. I have spent over £2500 on getting this set up, and God, I am sick at present with the problems it has caused me and with trying to sort this out.

Thank you for your time and for reading this. Here is the detailed info.

Again, thank you for any help.
Gerard


So if I am understanding your post right, you cannot get the NICs to show up under network connections as a valid adapter in this spot?

What is the exact model number you bought? Are you sure they are even Ethernet NICs? Or did you buy InfiniBand adapters, which don't normally show up by default in that network connections spot or even pass IP packets by default?

Yes sir, that is correct. I bought 2 x ConnectX-4 Lx NICs for 25 Gb/s file transfer from my Workstation to 2 PCs with 50 TB of HDD space. PC1 is the normal media server and PC2 is the backup of PC1. I will be adding a 3rd backup, PC3, with hopefully faster storage: some sort of flash storage, anything better than the HDDs I have at present.

I had ordered a 3rd NIC, a ConnectX-5 this time, thinking that the 2 ConnectX-4 cards were faulty or that I was sold a fake. I have been testing the new NIC and the same results are happening.

I was testing on Win 10 Pro, and since I posted the request here at L1 I have been testing again, reinstalling with Win 11 Pro over the last few days, with funny results. PC1 is called Workstation and PC2 is called Backup Media PC. When I installed Win 11 Pro on each of these PCs, PC2's Network view picked up only PC1's computer name, i.e. Workstation, and displayed its Network icon.

I am setting PC2 and PC3 up as FTP/SMB servers, and the Workstation is just that: it processes data and videos, which are saved on a Gen4 x4 NVMe scratch disk and then transferred to and stored on both PC2 and PC3 at the same time.
I am testing the transfer speeds between the Workstation and all the other PCs that will be connected, via the Gen4 NVMe drives in each PC (Workstation, PC2 and PC3). PC3 is still on RJ45 CAT 8 cable and will be upgraded to 25 Gb/s when I can iron out these problems.
The funny thing is that when I link up all the above computers with my old RJ45 Ethernet cards, they all show up in the Network. When I try to replace one PC at a time with the ConnectX-4/5 cards, they do not.

In answer to your question, the ConnectX-4/ConnectX-5 cards are Ethernet only. I researched before buying these cards. My ConnectX-5 was bought from FS, and they advised me on this. I also read in a Q and A that these are Ethernet only. See the second picture I posted with the Information tab (fourth screen grab): down on the far left-hand side it says Port Type ETH, and ETH I would say means Ethernet. That’s why I posted such heavy data, so that the experts and people with better networking knowledge than I have might pinpoint the problem and help me.

Thank you all who have read this post and who have replied to help me.

Gerard

My only thought is that it is driver-related. I’ve had many ConnectX-4/5 NICs and never had a problem, but I’ve rarely used them with Windows (no problems there either, but I don’t recall how I installed the drivers).

PCI ID from the screenshot is 15B3 1015 0003 15B3, which using /usr/share/hwdata/pci.ids on my machine matches: ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT
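
The lookup above can be reproduced with a short script. This is a minimal sketch, assuming the usual pci.ids layout (vendor lines unindented, device lines indented by one tab, subsystem lines by two tabs and written as subvendor then subdevice). The function name pci_lookup is made up for illustration.

```shell
# Print the subsystem name for a vendor/device/subvendor/subdevice ID set.
# $1..$4: IDs in lowercase hex; $5: optional path to a pci.ids file.
pci_lookup() {
    awk -v v="$1" -v d="$2" -v sv="$3" -v sd="$4" '
        /^#/ || /^$/  { next }                      # comments and blank lines
        /^C /         { in_v = 0; in_d = 0; next }  # device-class section at the end of the file
        /^[0-9a-f]/   { in_v = (($1 "") == v); in_d = 0; next }      # vendor line
        /^\t[0-9a-f]/ { if (in_v) in_d = (($1 "") == d); next }      # device line
        /^\t\t[0-9a-f]/ && in_v && in_d && ($1 "") == sv && ($2 "") == sd {
            sub(/^\t\t[0-9a-f]+ [0-9a-f]+ +/, "")   # strip the two IDs, keep the name
            print
            exit
        }
    ' "${5:-/usr/share/hwdata/pci.ids}"
}

# The IDs from the screenshot: vendor 15b3, device 1015, subsystem 15b3:0003.
if [ -r /usr/share/hwdata/pci.ids ]; then
    pci_lookup 15b3 1015 15b3 0003
fi
```

The IDs are compared as strings so that hex values which happen to look like numbers are not compared numerically.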

According to https://network.nvidia.com/files/doc-2020/pb-connectx-4-lx-en-card.pdf , that is a dual port 25GbE card, so agrees with everything else. I don’t see anything unusual in the PCI params or the ethernet params, everything seems normal for Ethernet mode and not in Infiniband mode.

I would only assume it is a problem with the drivers in Windows. I’m absolutely terrible at Windows, so my only advice is: Can you try booting into Linux to see if it shows up?

All Linux kernels for years have had the mlx4/mlx5 drivers for all Mellanox NICs built-in, shouldn’t need to load anything.

lspci should show it without having any modules loaded.

for dev in /sys/class/net/* ; do echo "$(basename "$dev") : $(basename "$(readlink "$dev/device/driver")")" ; done

should show:

eth0 : mlx5_core
eth1 : mlx5_core
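
On a machine with several NICs it can help to print only the interfaces bound to a Mellanox driver. The following is a minimal sketch of the same sysfs walk as a reusable function. The function name list_ifaces_by_driver and its two parameters are made up for illustration, and the default prefix "mlx" is an assumption chosen to match both mlx4 and mlx5 driver names.

```shell
# List interfaces whose bound kernel driver name starts with a given prefix.
# $1: driver name prefix (default "mlx")
# $2: sysfs net directory (default /sys/class/net)
list_ifaces_by_driver() {
    prefix=${1:-mlx}
    root=${2:-/sys/class/net}
    for dev in "$root"/*; do
        # Virtual interfaces such as lo have no device/driver link, so skip them.
        [ -L "$dev/device/driver" ] || continue
        drv=$(basename "$(readlink "$dev/device/driver")")
        case $drv in
            "$prefix"*) echo "$(basename "$dev") : $drv" ;;
        esac
    done
}

list_ifaces_by_driver mlx
```

If this prints nothing while lspci still lists the card, the driver has not bound to the device.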

ethtool eth0 should show (something like):

Settings for eth0:
	Supported ports: [ FIBRE ]
	Supported link modes:   1000baseKX/Full
	                        10000baseKR/Full
	                        25000baseCR/Full
	                        25000baseKR/Full
	                        25000baseSR/Full
...
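
To check that output without reading it by eye, the supported link modes can be scanned for a 25 Gb/s entry. This is a minimal sketch that only parses text in the layout shown above. The function name supports_25g is made up for illustration, and it looks only at the "Supported link modes" section, not the advertised modes.

```shell
# Read ethtool output on stdin; succeed if a supported link mode is 25 Gb/s.
supports_25g() {
    awk '
        /Supported link modes:/ { in_modes = 1 }
        # Any other "Name:" field ends the supported-modes section.
        /^[ \t]*[A-Z][A-Za-z ]*:/ && !/Supported link modes:/ { in_modes = 0 }
        in_modes && /25000base/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage (interface name is an example):
#   ethtool eth0 | supports_25g && echo "25G link modes supported"
```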

Disable automatic driver install on Windows 10

  1. Open Settings.
  2. Click on System.
  3. Click on About.
  4. Under the “Related settings” section, click the System protection option.
  5. Click the Hardware tab.
  6. Click the “Device Installation Settings” button.
  7. Select the “No (your device might not work as expected)” option.

Then install the latest driver from Mellanox and reboot.
Then open Device Manager, and it should look like this:


Are you using SFP28 DAC or 25Gbit SFP28 transceiver?

Sorry, sirs, for replying so late; real life got in the way. Thank you for your response to my request. I am using official Ubiquiti DACs (1 x 3 m and 1 x 5 m) and generic DACs (1 x 3 m and 1 x 5 m), plus a 7 m fibre optic link with a generic Ubiquiti-compatible transceiver at the Ubiquiti Aggregation Pro switch end and a generic Mellanox-compatible transceiver at the NIC end. This was bought at the suggestion of a rep from FSdotcom and works the same.

As an update to the problem: after heavy testing with these cards, it turned out that the switch was faulty. The switch has only 4 x 25 GbE ports, and when I closed the network rack door it pushed down on the 4 DAC leads that I had connected for testing, and the light on port 30 went out. On further inspection, when I pushed down with very little force, the port disconnected and died. Take your finger off and it connected again, though sometimes not.

I got a replacement switch, and now, testing under Windows 11 with these Mellanox cards (2 x ConnectX-4 and the ConnectX-5), I am getting better speeds via Windows drag and drop, with a solid connection.

Sorry, to answer your question:
Are you using SFP28 DAC or 25Gbit SFP28 transceiver?
Both, sir.

My test uses 2 x FireCuda 1 TB NVMe SSDs, 1 in each PC, connected via the Mellanox NICs. My speed is now 1.7 Gbe/1.8 Gbe via Windows drag and drop. I am migrating from HDD to SSD or NVMe storage and seeing what is best for me at present. I am thinking of a U.3 setup, using the U.3 connection to connect U.2 NVMe drives via this card: Broadcom MegaRAID 9670W-16i NVMe/SAS/SATA 16-port PCIe 4.0 RAID controller.
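
For comparing drag-and-drop figures with the link speed, it helps to convert the line rate from gigabits to megabytes per second. This is a minimal sketch in decimal units that ignores protocol overhead, so real transfers will land somewhat below these ceilings. The function name is made up for illustration.

```shell
# Convert a link speed in Gb/s to its raw ceiling in MB/s (decimal units).
gbit_to_mbyte_per_s() {
    echo $(( $1 * 1000 / 8 ))
}

gbit_to_mbyte_per_s 10   # prints 1250
gbit_to_mbyte_per_s 25   # prints 3125
```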

Thank you for your time

Gerard