Iperf benchmark on pfsense box, bad dual link performance (1st run)

I have an 82571EB/82571GB Gigabit Ethernet Controller (dual-port card with a PCIe x4 interface).
When I run a dual-link performance test I get lower speeds than expected; a subsequent run gives better results, but still falls short.

This card is in the top x16 slot of a Gigabyte GA-A55M-DS2 rev 1.1 motherboard with an A6-3500

  • Note that I have downclocked the graphics core as low as it will go in the BIOS, as I suspect its VRM can’t handle the transient loads anymore (but that should not be an issue for headless operation and occasionally outputting a tty over VGA).
    For some reason my first run in dual-link mode performs quite slowly. Am I bottlenecked by some power-saving feature, or is this some CPU cache speed/capacity limitation of these old budget chips?

I do have two consumer unmanaged switches between these systems. One has never given me an issue, and I have bypassed the other with the same results. I have also tested different systems with iperf over my other switch and never saw this behavior, so it has to be something with this ~10-year-old hardware, but I am just wondering what.

[2.6.0-RELEASE][[email protected]]/root: iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 63.7 KByte (default)
------------------------------------------------------------
[  2] local 10.0.0.3 port 5001 connected with 10.0.0.50 port 47948
------------------------------------------------------------
Client connecting to 10.0.0.50, TCP port 5001
TCP window size: 88.2 KByte (default)
------------------------------------------------------------
[ *1] local 10.0.0.3 port 27602 connected with 10.0.0.50 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *1] 0.00-10.02 sec  1.06 GBytes   909 Mbits/sec
[  2] 0.00-10.03 sec   440 MBytes   368 Mbits/sec
[SUM] 0.00-10.03 sec  1.49 GBytes  1.28 Gbits/sec
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.50 port 47950
------------------------------------------------------------
Client connecting to 10.0.0.50, TCP port 5001
TCP window size: 64.2 KByte (default)
------------------------------------------------------------
[ *3] local 10.0.0.3 port 33948 connected with 10.0.0.50 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[  4] 0.00-10.03 sec   906 MBytes   758 Mbits/sec
[ *3] 0.00-10.02 sec  1.04 GBytes   890 Mbits/sec
[SUM] 0.00-10.02 sec  1.92 GBytes  1.65 Gbits/sec
[  6] local 10.0.0.3 port 5001 connected with 10.0.0.50 port 47952
------------------------------------------------------------
Client connecting to 10.0.0.50, TCP port 5001
TCP window size: 96.2 KByte (default)
------------------------------------------------------------
[ *5] local 10.0.0.3 port 18720 connected with 10.0.0.50 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *5] 0.00-10.02 sec  1.05 GBytes   896 Mbits/sec
[  6] 0.00-10.03 sec   753 MBytes   629 Mbits/sec
[SUM] 0.00-10.03 sec  1.78 GBytes  1.52 Gbits/sec
[  8] local 10.0.0.3 port 5001 connected with 10.0.0.50 port 47954
------------------------------------------------------------
Client connecting to 10.0.0.50, TCP port 5001
TCP window size: 96.2 KByte (default)
------------------------------------------------------------
[ *7] local 10.0.0.3 port 62748 connected with 10.0.0.50 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *7] 0.00-10.02 sec  1.03 GBytes   886 Mbits/sec
[  8] 0.00-10.03 sec  1017 MBytes   851 Mbits/sec
[SUM] 0.00-10.03 sec  2.03 GBytes  1.74 Gbits/sec

How exactly are you load balancing traffic across two 1Gbps links?

I don’t think that old hardware is at fault or too slow for 2x1Gbps… if a Raspberry Pi or an old Celeron N3150 can do 1+1 Gbps bidirectional NAT routing, I don’t think a “full fat” A6-3500 should have issues.


To my knowledge, there should not be any load balancing configured.

X470 Gaming Plus w/ R5 3600 (NIC: Intel Corporation 82572EI Gigabit Ethernet Controller)

  • Kubuntu 22.04

unmanaged gigabit switch (Asus GX-D1081; VIP port goes to my old router running DD-WRT)

pfsense box (not deployed yet; DHCP is disabled and it sits on a spare static IP for testing/configuration)

EDIT: If I boot a different OS on the A6-3500, I am unable to reproduce this.

By load balancing I was thinking about those results showing that you’re getting more than 1Gbps on iperf over a card with 1Gbps per port. So… how is this working?

It should be 1Gbps up and 1Gbps down when running a dual test, e.g. iperf -c 10.0.0.50 -d
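
For completeness, a minimal sketch of both ends of that test, assuming iperf 2 syntax and the addresses from the logs below:

$ iperf -s                # server side
$ iperf -c 10.0.0.50 -d   # client side; -d ("dual test") opens a second,
                          # reverse connection so both directions run at once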

$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 10.0.0.50 port 5001 connected with 10.0.0.132 port 39386
------------------------------------------------------------
Client connecting to 10.0.0.132, TCP port 5001
TCP window size: 1.23 MByte (default)
------------------------------------------------------------
[ *2] local 10.0.0.50 port 47716 connected with 10.0.0.132 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0267 sec   872 MBytes   729 Mbits/sec
[ *2] 0.0000-10.0381 sec  1.09 GBytes   937 Mbits/sec
[SUM] 0.0000-10.0267 sec  1.95 GBytes  1.67 Gbits/sec
[  3] local 10.0.0.50 port 5001 connected with 10.0.0.132 port 39390
------------------------------------------------------------
Client connecting to 10.0.0.132, TCP port 5001
TCP window size: 1.10 MByte (default)
------------------------------------------------------------
[ *4] local 10.0.0.50 port 47718 connected with 10.0.0.132 port 5001 (reverse)
[ ID] Interval       Transfer     Bandwidth
[ *4] 0.0000-10.0348 sec  1.09 GBytes   937 Mbits/sec
[  3] 0.0000-10.0277 sec   880 MBytes   736 Mbits/sec
[SUM] 0.0000-10.0277 sec  1.95 GBytes  1.67 Gbits/sec

NIC at 10.0.0.50: 24:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
NIC at 10.0.0.132: 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09)

I have my switch and a WZR-300HP router (firmware: DD-WRT v3.0-r28444 std (12/05/15)) between these 2 systems, and everything is using Cat5e cable

  • the Wi-Fi has not acted up on that firmware version and has been solid since release

Oh, I see. I’d usually do iperf3 ... --bidir, which gives slightly different output. … I thought you were bonding multiple links at first.

So you’re able to send at 937 Mbps (1Gbps Ethernet minus VLAN framing, IP, and TCP overhead), but can only receive at 750-ish while sending at full speed?

I think it could be interrupt processing not being fast enough, or buffers being too small – often, context switching into the network driver takes more CPU work than processing the packet itself.

This could be a power-saving issue (all of a sudden you’d have a 500MHz CPU instead of whatever you normally run at; you can try disabling it), or maybe you don’t have RX interrupt coalescing turned on, which would let the NIC buffer multiple frames before asking the driver for attention via an interrupt. Check what features you have enabled with ethtool.
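
A minimal sketch of that check on the Linux side, assuming the interface is named eth0 (substitute the real name from ip link):

$ ethtool -c eth0                 # show current interrupt coalescing settings
$ ethtool -k eth0                 # show offload features (GRO/GSO, checksum offload, ...)
$ ethtool -C eth0 rx-usecs 100    # example: let the NIC batch ~100us of frames per RX interrupt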

Also, you can try pinning the device’s interrupts so they always land on the same core, either with numactl or manually via /proc/irq/<irq>/smp_affinity (this implies disabling irqbalance). That keeps the various caches inside the CPU core warm between packets.
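
A rough sketch of the manual route (IRQ number 42 and eth0 are placeholders; look yours up in /proc/interrupts):

$ systemctl stop irqbalance            # otherwise it will move the IRQ back
$ grep eth0 /proc/interrupts           # find the NIC's IRQ line(s)
$ echo 1 > /proc/irq/42/smp_affinity   # hex bitmask: bit 0 = core 0, so "1" pins to core 0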

So it seems pfsense does not have an ethtool command, and /proc is empty.
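
That is expected, since pfsense is FreeBSD-based and the Linux tooling doesn’t apply. The rough FreeBSD equivalents, assuming this dual-port card attached as em0/em1 via the em(4) driver:

$ vmstat -i         # per-device interrupt rates, FreeBSD's analogue of /proc/interrupts
$ sysctl dev.em.0   # driver/device sysctls for the first port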

I am guessing power management is just buggy on this platform, as it was never popular. I can’t even tell in software what the clock speeds are, or whether the clock can even boost to 2.4GHz.
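
For what it’s worth, FreeBSD normally exposes clock speeds via sysctl, assuming the cpufreq(4) driver attached on this platform (which, given the suspected power-management bugs, it may not have):

$ sysctl dev.cpu.0.freq          # current frequency of core 0, in MHz
$ sysctl dev.cpu.0.freq_levels   # the frequency/power levels the driver believes exist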

The best way I have to check CPU power usage is my clamp-on amp meter (the only one I can trust).

There are clearly some power management issues with pfsense on this hardware

Ubuntu
3.5A - 42W - Prime95
0.45A - 5.4W - idle

pfsense
3.8A - 45.6W - Prime95
0.8A - 9.6W - idle

pfsense uses almost 80% more power at idle

powerd does not seem to do anything

note that these readings were done using a clamp meter, not software
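
One hedged way to double-check powerd, using stock FreeBSD commands (pfsense also exposes the PowerD toggle under System > Advanced > Miscellaneous):

$ ps ax | grep '[p]owerd'   # is it running at all?
$ powerd -v                 # run it in the foreground; -v prints each frequency change it attempts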