Memory Unleashed on Threadripper: 128gb & 2933 & ECC tested | Level One Techs


#21

There are a lot of 2933 scores that are better on my setup, and the ones that are worse are worse than I would think they should be.

The scary part is that all tests were done at the same time on the same boot. Literally just the same benchmark line over and over again.

So I’m not sure what I would have screwed up that made some things faster and some things slower. Will look into it.


#22

I think the ‘stream’ test uses fairly small memory operations, which you seem somewhat faster at. Outside of the ‘stream’ tests, it becomes interesting with a more obvious pattern.

I ran through the same bunch of tests on Gentoo and other than being a whole lot faster, the actual ratios between the various scores were still in line with Fedora.

Something strange is afoot.
Let me know if you need any extra info.


#23

Try changing the interleaving settings to 256 or 512 bytes and see if that more closely matches my results. I think I was on 3.31, which had some bugs, and I wonder if that was different. The 2990 is always in NUMA mode, BUT you still have control over how many bytes are interleaved across each DIMM on each node.


#24

I have options for interleave type (None,Channel,Die,Socket,Auto), but no options for setting the size that I can find. Sorry.

Motherboard is MSI MEG X399 CREATION.


#25

It might be under nbio or memory options?


#26

I’ve been through the options three times now. If it’s there, I can’t find it. :frowning:


#27

Doing this but soon I’ll do that


#28

I was just about to mention exactly this.

My setup (X399 Taichi with 8x single-rank 8GB ECC RAM at 2933 14-15-13-13-31-45) has:
Memory Interleaving = Channel
Memory interleaving size = 512
Channel interleaving hash = enabled

This is under “Advanced->AMD CBS->DF Common Options”

These settings will have an impact on performance, though frankly I couldn’t tell you how or why. On ASRock boards with a newer BIOS (I’m on 3.20), these must be set manually to be sure of the setting.
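As a rough illustration of what the interleaving size controls, here is a minimal sketch of plain round-robin interleaving. It is a simplified model and not how the memory controller actually maps addresses (the "Channel interleaving hash" option above suggests the real mapping is hashed). The function names and the two-channel count are assumptions made for the example.

```python
def channel_for_address(addr: int, interleave_bytes: int, channels: int) -> int:
    """Channel that owns a physical address under simple round-robin interleaving."""
    return (addr // interleave_bytes) % channels


def channels_touched(start: int, length: int, interleave_bytes: int, channels: int) -> int:
    """How many distinct channels a contiguous transfer spreads across."""
    first_block = start // interleave_bytes
    last_block = (start + length - 1) // interleave_bytes
    blocks = last_block - first_block + 1
    # Consecutive blocks rotate through channels, so the count is capped.
    return min(blocks, channels)


if __name__ == "__main__":
    # Hypothetical node with 2 channels and a 1 KiB transfer starting at 0.
    for size in (256, 512, 2048):
        n = channels_touched(0, 1024, size, channels=2)
        print(f"interleave {size:>4} bytes -> {n} channel(s) used")
```

In this toy model a small interleave size spreads even short transfers over every channel, while a large one keeps a short transfer on a single channel. That is one plausible reason the setting could shift results differently at different transfer sizes.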


#29

I feel certain that explains both the better and worse results – the defaults with the 2990 on that bios are probably different


#30

After a bit of Google searching, it looks like there are quite a few motherboards that don’t have the menu options enabled. People have taken to hacking various BIOSes (none for my board) to enable the menus. MSI seems to be notorious for not providing the options.

Is there any way to read the current settings via the OS, perhaps an EFI var?

Hacking at the BIOS is always an option, though a lot of effort just to run a test. Plenty of rabbit holes along that trail too.
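On the question of reading settings from the OS: on Linux, UEFI variables are normally exposed as files under /sys/firmware/efi/efivars. Below is a minimal sketch that only lists variable names matching some keywords. Whether any variable actually holds the interleave settings, and in what format, is unknown. The keywords and the function name are guesses for illustration.

```python
from pathlib import Path

# Standard efivarfs mount point on Linux; absent on non-UEFI boots.
EFIVARS_DIR = Path("/sys/firmware/efi/efivars")


def find_efi_vars(keywords, root=EFIVARS_DIR):
    """Return sorted names of EFI variables whose name contains any keyword."""
    root = Path(root)
    if not root.is_dir():
        return []
    wanted = [k.lower() for k in keywords]
    return sorted(
        p.name for p in root.iterdir()
        if any(k in p.name.lower() for k in wanted)
    )


if __name__ == "__main__":
    # Hypothetical keywords; the real variable name (if any) may differ.
    for name in find_efi_vars(["setup", "amd", "cbs"]):
        print(name)
```

This only reads directory entries, so it is safe to run. Decoding the contents of a vendor setup variable would be a separate and board-specific job.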


#31

https://openbenchmarking.org/result/1810142-FO-MEMORY29908

Loaded setup defaults, then loaded XMP, and re-ran the test. I didn’t look for anything obviously wrong in UEFI yet; will do that in a sec and post images.


#32

I’ve run another bunch of benchmarks, very interesting, but no closer to finding where things differ. I don’t have the option to change interleaving size, but I was curious to see the result of each interleaving option (all five).

There were some minor differences in my various memory subtimings when compared to the posted images above, so I reconfigured them to match. Suddenly, the ‘stream’ tests where I was slower shot to the lead. I was half expecting the scores where I had been faster to drop to match the suspect machine. Nope, the darn thing just went faster.

For the final benchmark run, I simply rebooted to Gentoo and ran the test again, with no BIOS changes. Yeah, I got sick of the Ubuntu score thinking it owned the place, had to set it straight.

https://openbenchmarking.org/result/1810159-SK-1810157FO74

In the MBW tests, the suspect machine goes slower as transfer sizes increase, which I find odd, since there should be less overhead with larger blocks. My system doesn’t follow this trend. It’s subtle but I think it’s related.

I’ll post the subtiming details shortly, just have to tend to some other things first.


#33

added 256 byte results
https://openbenchmarking.org/result/1810155-FO-MEMORY29933


#34

Here are the timings that are different between the boards/RAM profiles.

Signal    Wendell   CrayZeApe
tRC       48        69
Twrrd     5         4
tCKE      5         8
ProcODT   8         60 Ohm

Edit: Removed invalid tRFC4 signal entry from table.


#35

Here are results with 2k interleave (instead of 256 bytes).

Didn’t have time to test all settings.

I would think rfc4 being so much lower would be a big deal. It may be that twrrd being 5 is the only real difference?

https://openbenchmarking.org/result/1810154-FO-MEMORY29909


#36

From my own rough and dirty half assed testing (using various tools on Windows) when I was beginning overclocking, rfc4 never made a noticeable difference even after lowering it by 100 from what was automatically set, so I just stopped touching it.

That said, Stilt (a very prominent ram overclocker) mentioned it was one of the important subtimings early on when Ryzen had come out.

To complicate things, I strongly suspect that AM4/TR4 motherboards will outright ignore some settings, or have various hidden things going on that result in the same practical effect.

I’m considering doing some comprehensive testing of my own, since I’ve once again decided on a whim to blow up my old setup and screw around with MX17. Because I hate evenings, my goal is to have the system boot to a ZFS partition, though as a Linux newbie this will be slow going.

But if you guys give me a quick rundown on how to use the Phoronix Test Suite (give me the commands to copy/paste or the scripts to automate everything) I could do some thorough incremental testing, and just use my backup computer(s) for a bit. We’ve got a compost pile going, might as well shit on it myself.

How long does a set of tests take to run, and are you doing multiple runs?


#37

I messed up on the tRFC4 value as I only have access to tRFC and hadn’t worked out what I needed to know yet.

So 514 / 1.34 / 1.625 = 236.0505 which is pretty close to Wendell’s tRFC4 value of 235. I’ll remove it from the table as it’s essentially the same. I’ll set it back to 514 and rerun the test.
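The arithmetic above can be checked with a few lines of Python. The two divisors are the ones quoted in the post. Reading the intermediate value as tRFC2 is my assumption, and the function name is made up for the example.

```python
# Divisors as quoted in the post; treat them as rules of thumb.
TRFC_TO_TRFC2 = 1.34
TRFC2_TO_TRFC4 = 1.625


def derive_trfc4(trfc: int) -> float:
    """Estimate tRFC4 (in clock cycles) from tRFC using the two divisors."""
    trfc2 = trfc / TRFC_TO_TRFC2   # assumed intermediate step (tRFC2)
    return trfc2 / TRFC2_TO_TRFC4


if __name__ == "__main__":
    estimate = derive_trfc4(514)
    print(f"tRFC 514 -> tRFC4 of roughly {estimate:.2f}")  # roughly 236.05
```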

Typing ‘phoronix-test-suite’ with no parameters gives a fairly easy to understand help display, but here are some useful commands.

You can automate a benchmark run against a previously uploaded result, for instance, Wendell’s latest update linked above, by using the ID from the URL as follows (without quotes).
“phoronix-test-suite benchmark 1810154-FO-MEMORY29909”

Run a particular test (eg. x264):
“phoronix-test-suite benchmark x264”

List all tests:
“phoronix-test-suite list-all-tests”

Upload a result if you said ‘No’ when initially asked:
“phoronix-test-suite upload-result {test ID}”

Rename test identifiers (also lets you add characters like ‘*’ etc. which are ignored upon initial naming):
“phoronix-test-suite rename-identifier-in-result-file {test ID}”

Change the order in which results are displayed:
“phoronix-test-suite reorder-result-file {test ID}”
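For anyone who wants to script this, here is a minimal sketch that wraps the "benchmark" command listed above and times the run. It uses only that one subcommand. The helper names are hypothetical, and the suite may still ask its usual questions in the terminal because the script does not answer prompts for you.

```python
import shutil
import subprocess
import time

PTS = "phoronix-test-suite"


def build_benchmark_command(target: str) -> list[str]:
    """Command line for benchmarking a test name or an uploaded result ID."""
    return [PTS, "benchmark", target]


def run_timed(target: str) -> float:
    """Run the benchmark and return the wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(build_benchmark_command(target), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    if shutil.which(PTS) is None:
        print(f"{PTS} not found on PATH; install it first.")
    else:
        # Result ID taken from the URL quoted earlier in the thread.
        elapsed = run_timed("1810154-FO-MEMORY29909")
        print(f"Run took {elapsed / 60:.1f} minutes")
```

Looping over several targets, or re-running after each BIOS change, is then just a matter of calling run_timed more than once.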

As I’m going to rerun against Wendell’s latest update, I’ll time it and post a more accurate time when finished, though it’s somewhere in the vicinity of an hour for Wendell’s batch of tests. Other tests vary in duration, though most are fairly short.


#38

I was a little off on the time, it took about half an hour.

real 28m56.377s
user 170m34.660s
sys 2m53.299s
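As a side calculation, the three figures above give a rough idea of how many cores the run kept busy on average: (user + sys) / real. This is a small sketch with a hypothetical helper for parsing the "XmY.YYYs" format. It says nothing about memory performance itself.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell 'time' value such as '28m56.377s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def average_busy_cores(real: str, user: str, sys_time: str) -> float:
    """CPU time divided by wall time, i.e. average number of busy cores."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


if __name__ == "__main__":
    ratio = average_busy_cores("28m56.377s", "170m34.660s", "2m53.299s")
    print(f"Average busy cores: {ratio:.2f}")  # about 6
```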

I lost what I’d gained on the small transfer sizes, but now the timing signals match Wendell’s properly.

https://openbenchmarking.org/result/1810151-FO-1810154FO33


#39

Thanks, that intro was perfect. Doing a test run now while I attempt to make homemade gummi bears and decide what sort of permutations I want to attempt. The estimated run-time is way off, should definitely not take very long.

2018-10-19 Edit
In case anyone is actually waiting for results, I got distracted by a shit ton of “new guy actually figuring out linux for real this time” problems.
I have benchmarks in progress now, though I’m forcing a minimum of 10 runs and 4 loops for these tests, across some incremental BIOS changes, so it’s gonna be a few days.


#40

The estimated run-time is always way off to begin with; it ‘learns’ a more accurate idea of time after a test is run a few times.