Connectx-4 Lx 2x 25GbE with ASPM support + idle power consumption measurements

I recently got myself 2 ConnectX-4 Lx cards to replace some very sketchy ConnectX-3 cards (OCP 2.0 cards on PCIe adapters that negotiate only PCIe 3.0 x2 D:).
I’ve found decent cards on aliexpress for 62 EUR without taxes. It’s a steal at that price.

Out of the box, the cards didn’t support ASPM. I’ve seen some people on ServeTheHome saying that ASPM support is firmware dependent. Thankfully my cards were not OEM locked, so I updated the firmware and got ASPM working on both cards.

Here’s some info for people planning power usage in their low idle power servers.

Specs of my cards.
model: MCX4121A-ACAT
PSID: MT_2420110034
board revision: AB
Manufacturing date: 2017-04-23

Newest firmware version I installed
Firmware version 14.32.1010

I took some measurements of my Asrock n100dc-itx with and without this card.
Power measuring device: Shelly plug S
Power measured at the wall
Power adapter: some 2nd hand fujitsu 19v 90w
PC specs:
n100dc-itx + 32gb ram at 2400MTs + 512GB samsung PM981 + 1 Gb Ethernet link active

This test setup has a limitation: the n100dc-itx PCIe slot has only 2 lanes of PCIe 3.0.

Idle measurement with powertop --auto-tune: 7.10 W
Idle measurement with powertop --auto-tune + connectx-4 Lx no connection: ~13.12 W

Idle measurement with powertop --auto-tune + connectx-4 Lx + 1 SFP+ DAC cable and interface up: ~14.65 W

Idle measurement with powertop --auto-tune + connectx-4 Lx + 2 SFP+ DAC cables and interfaces up: ~15.46 W

Idle card adds 6.02 W to this system
Idle card with 1 SFP+ connection adds 7.55 W to this system
Idle card with 2 SFP+ connections adds 8.36 W to this system
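
For anyone who wants to redo this arithmetic with their own wall readings, here is a minimal sketch in shell. The function names power_delta and kwh_per_year are made up for this example, and the yearly figure assumes the machine sits at idle 24 hours a day, 365 days a year, which is only an approximation of real use.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any tool mentioned in this thread.

# Added power relative to the bare-system baseline, in watts.
# Usage: power_delta BASELINE_W MEASURED_W
power_delta() {
    LC_ALL=C awk -v base="$1" -v meas="$2" 'BEGIN { printf "%.2f\n", meas - base }'
}

# Energy per year in kWh for a constant draw (assumes 24 h x 365 d at idle).
# Usage: kwh_per_year WATTS
kwh_per_year() {
    LC_ALL=C awk -v w="$1" 'BEGIN { printf "%.1f\n", w * 24 * 365 / 1000 }'
}

baseline=7.10
for measured in 13.12 14.65 15.46; do
    delta=$(power_delta "$baseline" "$measured")
    echo "measured ${measured} W -> card adds ${delta} W (~$(kwh_per_year "$delta") kWh/year)"
done
```

The three readings passed in are the ones from the measurements above; swap in your own baseline and readings.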

I don’t have sfp28 cables to test power usage at faster link speeds. Maybe I’ll revisit this at some point.

I’m not sure how much of the power usage delta comes from the card itself and how much from the SoC’s PCIe block and other parts of the system. The card is not very hot at idle. I haven’t stress tested it because that would be pointless on this motherboard (and it is hard to isolate NIC power usage during iperf).

Powertop shows that the board reaches the package C3 power state, both with and without the card.

lspci ASPM info

lspci -s 01:00.0 -vv |grep -i aspm
		LnkCap:	Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <4us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
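
If you want to check several cards or machines without eyeballing the output, something like the following could work. It is only a sketch: aspm_status is a made-up helper name, and it assumes the LnkCap/LnkCtl line layout shown above, which may differ between lspci versions.

```shell
#!/bin/sh
# Hypothetical helper: summarise ASPM capability and current state
# from `lspci -vv` text supplied on stdin.
aspm_status() {
    awk '
        BEGIN { cap = "unknown"; ctl = "unknown" }
        # LnkCap lists what the device supports, e.g. "ASPM L1," or "ASPM not supported,"
        /LnkCap:/ && match($0, /ASPM [^,]*/) {
            cap = substr($0, RSTART + 5, RLENGTH - 5)
        }
        # LnkCtl shows what is actually configured, e.g. "ASPM L1 Enabled;" or "ASPM Disabled;"
        /LnkCtl:/ && match($0, /ASPM [^;]*/) {
            ctl = substr($0, RSTART + 5, RLENGTH - 5)
        }
        END { printf "supported=%s state=%s\n", cap, ctl }
    '
}

# Example (needs root to read the full capability list):
#   sudo lspci -s 01:00.0 -vv | aspm_status
```

A card that advertises L1 in LnkCap but shows Disabled in LnkCtl points at the platform side (BIOS or OS policy) rather than the card firmware.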

Thank you, this is extremely helpful! Do you mind sharing the exact vendor link you bought your cards from so I can make sure I get the exact same ones? As recently as a few months ago, people in this reddit thread were complaining of being unable to get ASPM working on their cards: https://www.reddit.com/r/homelab/comments/15eqm9s/mellanox_connectx4_questions_before_buying/

I don’t want to end up in that situation.

I’ve ordered this item 2 times and got 2 good cards from this seller. The first card looked very generic. The second one had an IASER serial number tag, so it was probably from an Inspur branded device.
Their supply may be mixed.

https://aliexpress.com/item/1005005048786017.html

I’ve bought a third one but I’m still waiting for the package. I can post results once it arrives.


My third card arrived. Exactly the same model.

It came with firmware 14.18.1204. I updated it to 14.23.1010
and the card started to report L1 as available, the same as my previous cards.

LnkCap:	Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <4us

Interesting. I just received what looks to be a genuine card from an ebay seller.

The card came with an old firmware and accepted the latest update directly from nvidia without putting up any fight:

flint -d /dev/mst/mt4117_pciconf0 query
Image type:            FS3
FW Version:            14.24.1000
FW Release Date:       26.11.2018
Product Version:       14.24.1000
Rom Info:              type=UEFI version=14.17.11 cpu=AMD64,AARCH64
                       type=PXE version=3.5.603 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             98039b0300ad6554        4
Base MAC:              98039bad6554            4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_2420110034
Security Attributes:   N/A

flint -d /dev/mst/mt4117_pciconf0 -i fw-ConnectX4Lx-rel-14_32_1010-MCX4121A-ACA_Ax-UEFI-14.25.17-FlexBoot-3.6.502.bin burn

    Current FW version on flash:  14.24.1000
    New FW version:               14.32.1010

FSMST_INITIALIZE -   OK
Writing Boot image component -   OK
Restoring signature                     - OK
-I- To load new FW run mlxfwreset or reboot machine.

flint -d /dev/mst/mt4117_pciconf0 query
Image type:            FS3
FW Version:            14.32.1010
FW Release Date:       1.12.2021
Product Version:       14.32.1010
Rom Info:              type=UEFI version=14.25.17 cpu=AMD64,AARCH64
                       type=PXE version=3.6.502 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             98039b0300ad6554        4
Base MAC:              98039bad6554            4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_2420110034
Security Attributes:   N/A

Unfortunately, ASPM seems to be giving me trouble. I suspect it has something to do with my motherboard.

lspci
01:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
01:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

lspci -vvv -s 01:00.1 | grep -i aspm
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <4us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+

lspci -vvv -s 01:00.0 | grep -i aspm
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <4us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+

The system still reaches C3 on a clean boot with no VMs running and the card installed, though, so that’s good news.


If LnkCap: shows ASPM support, your card should be fine.
I had a similar situation on a sketchy Chinese Erying motherboard.
You could try a different motherboard or play with the settings in the configuration menu.
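
On Linux it may also be worth checking which ASPM policy the kernel itself is using, since that is separate from the motherboard settings. A minimal sketch, assuming a kernel built with PCIe ASPM support that exposes /sys/module/pcie_aspm/parameters/policy; active_policy is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: the policy file lists all policies and marks the
# active one with square brackets, e.g.
#   "[default] performance powersave powersupersave"
# Usage: active_policy "CONTENTS OF POLICY FILE"
active_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

policy_file=/sys/module/pcie_aspm/parameters/policy
if [ -r "$policy_file" ]; then
    echo "Active ASPM policy: $(active_policy "$(cat "$policy_file")")"
else
    echo "No $policy_file found (kernel may lack PCIe ASPM support)"
fi
```

If the BIOS does not hand ASPM control to the OS, changing the policy may have no effect, so treat this as one more thing to look at rather than a fix.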

I also have these same cards, ordered from eBay (CX4121A MCX4121A-ACAT Mellanox ConnectX-4 Lx 25GbE). How did you update the firmware? I only have the Linux command line available.
EDIT: I was able to update the firmware and now ASPM is available. Power consumption, on the other hand, didn’t go down, at least not yet. I have 40 W idle power consumption at the wall with a Ryzen 7900.

Can you tell us how you did it?


Just in case it is helpful for someone, I’ve used this script to update my CX-4/5/6/7/BF2 cards:

sudo apt install mstflint gawk
NEW_FIRMWARE_BIN="CHANGE ME TO THE PATH TO BIN FILE"
# Assume we have only one ConnectX-card and we want to flash only that card
PCI_ID=$(sudo lspci | gawk '($0 ~ /ConnectX/ && $1 ~ /\.0$/){print $1}' | head -n 1)
CARD_MODEL=$(sudo lspci | gawk '($0 ~ /ConnectX/ && $1 ~ /\.0$/){print gensub(/[\]\[]/, "", "g", $8$9)}' | head -n1)
BACKUP_DIR="${CARD_MODEL}_${PCI_ID}_backup"
mkdir -p "${BACKUP_DIR}"
sudo mstflint -d "${PCI_ID}" query full > "${BACKUP_DIR}"/full_query.txt
sudo mstflint -d "${PCI_ID}" hw query > "${BACKUP_DIR}"/hw_query.txt
sudo mstflint -d "${PCI_ID}" ri "${BACKUP_DIR}"/orig_firmware.bin
sudo mstflint -d "${PCI_ID}" dc "${BACKUP_DIR}"/orig_firmware.ini
sudo mstflint -d "${PCI_ID}" -i "${NEW_FIRMWARE_BIN}" -allow_psid_change burn
sudo mstfwreset -d "${PCI_ID}" reset
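
After the reset it can be handy to confirm which version the card now reports. The following is only a sketch: fw_version is a made-up helper, and it assumes the query output has the same "FW Version:" line as the flint output shown earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the "FW Version" field from
# `mstflint ... query` (or `flint ... query`) output supplied on stdin.
fw_version() {
    awk '$1 == "FW" && $2 == "Version:" { print $3; exit }'
}

# Example, reusing PCI_ID from the script above:
#   sudo mstflint -d "${PCI_ID}" query | fw_version
```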

Late to the Party, but you can try following the Instructions by Z8 (dot re) on how he upgraded the Mellanox ConnectX-4.

I basically put all those Instructions into a BASH Script on my NAS so I can replicate it on several Machines:

#!/bin/bash

# Abort on Error
set -e

# Determine toolpath if not set already
relativepath="./" # Define relative path to go from this script to the root level of the tool
if [[ ! -v toolpath ]]; then
    scriptpath=$(cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
    toolpath=$(realpath --canonicalize-missing "${scriptpath}/${relativepath}")
fi

# Get Card Serial Number
serialnumber=${1-""}

# Ask User Interactively
while [[ -z "${serialnumber}" ]]
do
    read -p "Enter Device Serial Number (can be found with lspci -vvv -s 01:00.0 | grep Serial): " serialnumber
done

# Get Card Device Name
mstdevice=${2-""}

# Ask User Interactively
while [[ -z "${mstdevice}" ]]
do
    read -p "Enter Network Adapter MST Device Name (can be found with ls -l /dev/mst/): " mstdevice
done

# Generate Timestamp
timestamp=$(date +"%Y%m%d_%H%M%S")

# Define Backup Location
backuplocation="/tools_local/mellanox/backup/${serialnumber}_${timestamp}"

# Create Folder for Backup
mkdir -p "${backuplocation}"

# Change to Backup Folder
cd "${backuplocation}" || exit

# Backup Information
flint -d /dev/mst/${mstdevice} query full > flint_query.txt
flint -d /dev/mst/${mstdevice} hw query > flint_hwinfo.txt
flint -d /dev/mst/${mstdevice} ri orig_firmware.bin

nano flint_hwinfo.txt
nano flint_query.txt

flint -d /dev/mst/${mstdevice} dc orig_firmware.ini

# This will FAIL if the Device does NOT have a ROM Flashed
#flint -d /dev/mst/${mstdevice} rrom orig_rom.bin

# Flash New Firmware
# (-allow_psid_change is only needed for cross-flashing (e.g. Huawei Firmware -> Mellanox Firmware))
flint -d /dev/mst/${mstdevice} -i ${toolpath}/firmware/fw-ConnectX4Lx-rel-14_32_1900-MCX4121A-ACA_Ax-UEFI-14.25.17-FlexBoot-3.6.502.bin -allow_psid_change burn

# Flash New Firmware
# flint -d /dev/mst/${mstdevice} -i ${toolpath}/firmware/fw-ConnectX4Lx-rel-14_32_1900-MCX4121A-ACA_Ax-UEFI-14.25.17-FlexBoot-3.6.502.bin burn

# Reset Network Adapter
mlxfwreset -d /dev/mst/${mstdevice} reset

# Change back to original Path
cd "${toolpath}" || exit

You’ll need to install at the very least the mstflint Open Source Tool from NVIDIA / Mellanox.

As for the Firmware, you can grab that from the Mellanox Website; otherwise check the Z8 download URL in his Post (there is a slightly more recent Version available though, I used fw-ConnectX4Lx-rel-14_32_1900-MCX4121A-ACA_Ax-UEFI-14.25.17-FlexBoot-3.6.502.bin).

EDIT 1: The Forum won’t allow me to post Links, so of course it’s going to be a bit more complicated like that …

Hey, I am curious if anyone has gotten ASPM to work on cards faster than 25GbE?

I’m pondering the purchase of 40/56/100GbE ConnectX-4 cards. Specifically, I am looking at these models:

  • MCX456A-ECAT - 100gb dual QSFP28
  • MCX414A-GCAT - 50gb dual QSFP28
  • MCX516A-BDAT - 50gb dual QSFP28 (Connectx-5)
  • MCX555A-ECAT - 100gb dual QSFP28 (Connectx-5)
  • MCX546A-CDAN - 100gb dual QSFP28 (Connectx-5)

Not the Model(s) you mentioned, but this one does NOT support ASPM:

Mellanox ConnectX-4 CX416A MCX416A-CCAT Dual Port 100GbE QSFP28 → ASPM NOT working

Did you manage to get ASPM working with this Mellanox ConnectX-4 LX 25gbps SFP28 NIC if the NIC is installed in a CPU directly-connected PCIe Slot?

Until now, no matter what I did (and I tried A LOT), I could only get the Mellanox ConnectX-4 LX 25gbps SFP28 NIC to work with ASPM enabled if the NIC is installed in a PCH/DMI/Chipset connected PCIe Slot.

In a CPU directly connected Slot, it will not allow ASPM, no matter what. Or, maybe better put, it will allow ASPM, but the System will be stuck at PC2 or PC3 anyway for whatever Reason :(. And on the Old Systems that I tested, it does NOT seem like Multi VC is to blame (as it is on much newer Platforms like 12th/13th Gen Intel).

I tested on Supermicro X10SLL-F, Supermicro X11SSL-F and Fujitsu TX1320 M3, all experiencing the same Behaviour. Including after Patching BIOS Settings and even trying to unlock some hidden ones.

I have seen the same behaviour on my 25 Gbit/s 4 Lx cards – ASPM is supported in current firmwares but neither does it affect the power consumption of the card itself nor does it allow the CPU to reach higher package C states (if installed in a CPU slot).