Epyc Rome Users - Coupled Mode 2933 MHz vs Uncoupled Mode 3200 MHz

Hello everyone,

Quick question: has anyone tested and shown a significant reduction in memory latency on Epyc Rome chips when using 2933 MHz RAM (coupled mode) vs 3200 MHz (uncoupled mode)? I have been trying to figure this out by checking all the reviewers from the era, but ServeTheHome, Phoronix, Level1 @wendell, etc. didn't really seem to benchmark this.

For those unfamiliar with the topic, Epyc Rome ran its Infinity Fabric clock (FCLK) lower than the maximum supported RAM speed of 3200 MHz, which added latency at 3200. If you ran 2933 MHz RAM instead, the fabric and memory clocks matched, which the AMD docs call "coupled mode", and that lowered latency. This was later fixed on Epyc Milan.
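To put rough numbers on that (a quick sketch, assuming the ~1467 MHz FCLK ceiling usually quoted for Rome): DDR4 transfers twice per memory clock, so 2933 MT/s is about where MEMCLK can match FCLK 1:1.

# DDR4 moves two transfers per memory clock, so MEMCLK = MT/s / 2
echo "scale=1; 3200 / 2" | bc   # 1600.0 MHz MEMCLK > ~1467 MHz FCLK -> async/uncoupled, added latency
echo "scale=1; 2933 / 2" | bc   # 1466.5 MHz MEMCLK ~= FCLK -> 1:1/coupled, lower latency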

Anyways… if you have info let me know.

Cheers!

If someone has the 3200 RAM on a Rome platform, can they tune it down to 2933 in the BIOS and see the difference? Or does it require actual 2933 sticks?

CPU Epyc 7F52 8x 32GB RAM
That's 2933 MT/s RAM overclocked to 3200 MT/s, but the IMC still runs at 2933 MT/s. I can't find a "Coupled Mode" setting anywhere, but I think the system is using it automatically (see the dmidecode sketch after the DIMM list).

	Slot	Type	ECC	Configured Speed	Size	Serial	Part Number	Manufacturer
	DIMMA1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A9167A	36ASF4G72PZ-2G9E2	Micron Technology
	DIMMB1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A924F2	36ASF4G72PZ-2G9E2	Micron Technology
	DIMMC1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A8F688	36ASF4G72PZ-2G9E2	Micron Technology
	DIMMD1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F21112D98EBBA	36ASF4G72PZ-2G9E2	Micron Technology
	DIMME1	DDR4	MultiBitECC	3200 MT/s	32768 MB	80AD0121178519B44A	HMA84GR7DJR4N-XN	SK Hynix
	DIMMF1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A90E98	36ASF4G72PZ-2G9E2	Micron Technology
	DIMMG1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A8BAC0	36ASF4G72PZ-2G9E2	Micron Technology
	DIMMH1	DDR4	MultiBitECC	3200 MT/s	32768 MB	S802C0F220634A90EC7	36ASF4G72PZ-2G9E2	Micron Technology
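If anyone wants to check the rated-vs-configured split on their own box, here is a minimal sketch (assuming dmidecode is installed and the BIOS populates SMBIOS sensibly):

# SMBIOS type 17 = Memory Device. "Speed" is the DIMM's SPD-rated speed;
# "Configured Memory Speed" is what the memory controller is actually running.
dmidecode -t 17 | grep -E "Locator|Speed"

If the sticks are 2933-rated parts running at 3200, those two fields should disagree.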


################ 3200 MT/s ####################

root@pve:/home/user/Linux# echo 4000 > /proc/sys/vm/nr_hugepages   # reserve 4000 x 2 MiB hugepages (~8 GiB); mlc prefers hugepages to cut TLB-miss noise
root@pve:/home/user/Linux# ./mlc 
Intel(R) Memory Latency Checker - v3.11b
Measuring idle latencies for random access (in ns)...
                Numa node
Numa node            0
       0         119.1

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      146471.3
3:1 Reads-Writes :      138848.3
2:1 Reads-Writes :      140488.5
1:1 Reads-Writes :      141013.6
Stream-triad like:      142600.9

Measuring Memory Bandwidths between nodes within system 
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0
       0        146168.9

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  163.79   145131.8
 00002  164.70   144807.0
 00008  164.17   144670.7
 00015  164.96   144364.9
 00050  161.97   144561.8
 00100  147.30   116455.2
 00200  133.99    66158.1
 00300  131.09    46819.9
 00400  128.03    35871.7
 00500  127.30    29244.8
 00700  125.61    21285.5
 01000  125.11    15165.6
 01300  125.10    11802.2
 01700  125.37     9174.3
 02500  126.34     6396.9
 03500  126.79     4713.3
 05000  127.53     3452.0
 09000  128.50     2139.5
 20000  129.47     1233.7

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        125.3
Local Socket L2->L2 HITM latency        125.5

################ 2933 MT/s ####################

root@pve:/home/user/Linux# ./mlc 
Intel(R) Memory Latency Checker - v3.11b
Measuring idle latencies for random access (in ns)...
                Numa node
Numa node            0
       0         118.8

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      143089.7
3:1 Reads-Writes :      137099.9
2:1 Reads-Writes :      139180.9
1:1 Reads-Writes :      140306.7
Stream-triad like:      139655.8

Measuring Memory Bandwidths between nodes within system 
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0
       0        143965.7

Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
==========================
 00000  163.53   142517.5
 00002  163.87   142287.2
 00008  163.83   142029.4
 00015  163.47   141836.7
 00050  161.77   141864.5
 00100  137.57   115385.5
 00200  124.49    65952.2
 00300  130.69    46506.1
 00400  128.59    35551.7
 00500  127.14    28874.8
 00700  126.37    20991.5
 01000  125.29    14992.6
 01300  125.67    11691.7
 01700  125.39     9101.9
 02500  126.23     6364.2
 03500  126.63     4690.9
 05000  127.59     3432.2
 09000  128.23     2130.8
 20000  129.40     1229.0

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        125.3
Local Socket L2->L2 HITM latency        125.4
root@pve:/home/user/Linux#

Can anyone replicate this result with a Rome system?
It seems too low to me; shouldn't 8x 3200 MT/s achieve around 200 GB/s?

edit:
I checked myself: the IMC still runs at 2933, so the 3200 MT/s does not help.
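For reference, the back-of-the-envelope math behind that ~200 GB/s figure (a sketch, assuming all 8 channels populated with a 64-bit bus each):

# peak GB/s = channels * bus width in bytes * MT/s / 1000
echo "scale=1; 8 * 8 * 3200 / 1000" | bc   # 204.8 GB/s theoretical at 3200 MT/s
echo "scale=1; 8 * 8 * 2933 / 1000" | bc   # 187.7 GB/s theoretical at 2933 MT/s

Measured mlc numbers always land below the theoretical peak, and ~146 GB/s against a ~188 GB/s ceiling is consistent with the IMC still running at 2933.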

I don't have numbers, but with a 2700X I saw lower latency in a benchmark at 2933 vs 3200, and slightly higher bandwidth with 3200 in the same benchmark, but no real-world difference in boot speed, general OS use, or games (I used 3200).

3200 was the highest I could use and 2933 was a convenient drop-down option when testing (it sounded more interesting than 3000 :stuck_out_tongue: )
