Threadripper 2000 series /thread


#105

I think it’s the only ECC 2666 available out there that is officially supported by X399 boards.


#106

Yes you do… At 4.1 GHz all-core I’m routinely seeing 450-500W sustained in benchmark-style loads (2990WX).

Now water cooled… The 14S did a good job with 2 fans @ 100%, but 500W is just too much to ask it to sustain (no surprise).

3200+ is definitely a must… I’ve been wondering about higher-clocked, higher-CAS RAM. Intel doesn’t react well in real-world usage to higher latency, so it’s not worth going to 3600 if that also takes you to CAS 16 or higher (vs 3200/14). I’m wondering if that’s still true given the IF?
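For a rough sense of the trade-off in question, first-word CAS latency in nanoseconds can be estimated from the data rate and the CAS cycle count. This is only a minimal sketch: the function name `cas_latency_ns` is made up for illustration, and it ignores every other timing (tRCD, tRFC, command rate) and any Infinity Fabric effect, so on its own it says nothing about real-world throughput.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Approximate first-word CAS latency in nanoseconds for DDR memory."""
    # DDR transfers twice per clock, so the memory clock (MHz) is half the data rate.
    clock_mhz = data_rate_mts / 2
    # One clock cycle lasts 1000 / clock_mhz nanoseconds.
    return cas_cycles * 1000.0 / clock_mhz


# The two configurations compared in the post: 3200/14 and 3600/16.
for name, rate, cl in [("3200 CL14", 3200, 14), ("3600 CL16", 3600, 16)]:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

By this measure alone the two kits land within a fraction of a nanosecond of each other, so any real difference would have to come from somewhere else (secondary timings, fabric clock, power).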

It’s pretty frustrating trying to tune this thing in Linux - all my tools for measuring things are borked to some degree or other. So I broke down and installed Windows to tune, and now I’m back in Linux to see if it’s really-really stable…

Have not cracked the “best of both worlds” nut - still stuck with all-core 4.1GHz and no XFR+. This chip is asking for 1.45-1.5+ based on VID when it goes above 4.1?!? Yikes! (watching Ryzen master in “auto” is terrifying).


#107

Remember the measured voltage is not the same as voltage on chip. Resistance through the socket is an actual concern at these wattages.

My observation on ASRock testing so far is that opening up the wattages and keeping it cool results in higher and more consistent clocks. Just letting XFR and PBO do their thing.

Phoronix tests are within +/- 3% (most less than that, some vary more) vs a manual 4.1 all-core OC.

If I manually OC, my I/O throughput suffers and I think the platform has more errors. 2933 can be a sweet spot with low CAS latency. On a sample size of about 5, my experience has generally been better running around 2933-3200 with a lower CAS latency than at 3600 and above.

I have seen performance degradation in excess of 10% at 3600.

I have G.Skill’s new 2933 CAS 14 128GB kit and let me tell you that is some sweet sweet awesomeness.


#108

100% agreed - which is why I am not thrilled at what PBO is doing (1.5V+). That’s kinda terrifying coming from Intel OC land (chip-cooking numbers)…

Where I am with that at the moment is that XFR+ has either been unstable once power levels are raised or voltages become terrifying (see above).

Some of that instability may have been garbage USB ports on this MSI board. It does not have the required current capability to service the provided ports. 2 MSFT wireless keys (keyboard/mouse) and one USB 3.1 high-speed thumb drive (SanDisk 150MB/s) and it starts flaking out. I did not realize how badly until I tried to install linux on a thumb drive from a thumb drive at stock clocks… hilarity ensued…

I plugged in an internal USB header and had no issue… Something very broken around the back of this board…

I’m using B-die 3200C14 (TridentZ kit designed for Skylake). Thus far at 3200C14 CR1 it’s been stable… Skylake would not do CR1, only CR2, but I ended up tuning secondary/tertiary timings (tFAW, tRFC, etc.) on Skylake to compensate and I have not done that yet here…

I suspect the 3600 thing is package power… The higher clock rate drives IF speeds which means the SoC voltage likely needs to go up to compensate, but that’s just a hunch. I’m curious if a 3600C15 kit would still produce those perf drops if power were sorted. As AT pointed out the IF with 4 dies represents a LOT of power relative to the cores.

Now that the USB is sorted, I’m going to do some linux runs with my windows tuned values and see where I stand with XFR vs static OC…


#109

Interesting note… Setting a 40 or 41x multiplier in the BIOS with 4.18 kernel still tunes down the clock when idle (though not all the way to the 1.8GHz range I see at stock) - and then it pegs at 40 or 41 respectively when loaded as it should:

cpu MHz         : 3222.808
cpu MHz         : 3218.207
cpu MHz         : 3217.166
cpu MHz         : 3214.372
cpu MHz         : 3109.232
cpu MHz         : 3217.583
cpu MHz         : 3221.837
cpu MHz         : 3210.706
cpu MHz         : 3226.198
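Readings like these are easier to compare between idle and load when boiled down to min/max/mean. A minimal sketch, assuming the text comes from `/proc/cpuinfo` on Linux; the helper name `parse_cpu_mhz` is hypothetical and the sample is just the idle readings pasted above:

```python
import re
import statistics
from typing import List


def parse_cpu_mhz(cpuinfo_text: str) -> List[float]:
    """Extract every 'cpu MHz' value from /proc/cpuinfo-style text."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(v) for v in re.findall(pattern, cpuinfo_text, flags=re.MULTILINE)]


# Idle readings copied from the post (multiplier set in BIOS, 4.18 kernel).
sample = """\
cpu MHz         : 3222.808
cpu MHz         : 3218.207
cpu MHz         : 3217.166
cpu MHz         : 3214.372
cpu MHz         : 3109.232
cpu MHz         : 3217.583
cpu MHz         : 3221.837
cpu MHz         : 3210.706
cpu MHz         : 3226.198
"""

readings = parse_cpu_mhz(sample)
print(f"cores sampled: {len(readings)}")
print(f"min/max: {min(readings):.0f} / {max(readings):.0f} MHz")
print(f"mean: {statistics.mean(readings):.0f} MHz")
```

On a live system the same function could be fed `open("/proc/cpuinfo").read()` once at idle and once under load.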

In Windows 10 it’s 4GHz or 4.1GHz full time, depending on the BIOS setting…

EDIT: some Geekbench data:
https://browser.geekbench.com/user/cekim

4.1GHz all-core (1.325V BIOS setting):
single: 5322 multi: 62335 peak temp: 59C

PBO - running wild with a 600W limit (auto voltage BIOS setting):
single: 5394 multi: 60040 peak temp: 66C

I did indeed see 4.2GHz+ in the PBO example, but all-core tended to be in the 3.95GHz range, whereas the 4.1GHz fixed setup used a lower voltage and produced a better overall multi-core result at a lower temp. (Full custom loop - extra thick 360 rad - pump/fans at 100%.)
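To put the two Geekbench runs side by side, here is the relative difference worked out from the scores quoted above. A small sketch only; `pct_change` is a made-up helper and two single runs are not a statistically meaningful sample.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Scores quoted in the post.
fixed_41 = {"single": 5322, "multi": 62335}   # 4.1GHz all-core
pbo = {"single": 5394, "multi": 60040}        # PBO with 600W limit

# PBO's single-core gain over the fixed 4.1GHz setup.
print(f"single, PBO vs fixed: {pct_change(pbo['single'], fixed_41['single']):+.1f}%")
# The fixed setup's multi-core gain over PBO.
print(f"multi, fixed vs PBO: {pct_change(fixed_41['multi'], pbo['multi']):+.1f}%")
```

So PBO buys a bit over 1% single-core while the fixed 4.1GHz setup is a bit under 4% ahead on multi-core.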


#110

Try running an I/O bench at 4.1 vs 3.95. In my case the I/O suffered greatly at 4.1 but not at all on auto.


#111

Would you expect the MEG to allow the CPU to clock itself higher more often? Does the CPU (or can it) take into account the quality of its own power it’s getting somehow?

Also, OC3D reported running memory at 3200 on the MEG but only being able to get 2933 on the Asus, for example.


#112

Have a specific tool in mind for that?

FIO seems to be going entirely the other way:

diff fio_auto.log fio_4100.log | grep "/s"
<   write: IOPS=308k, BW=1201MiB/s (1260MB/s)(32.0GiB/27280msec)
>   write: IOPS=588k, BW=2296MiB/s (2407MB/s)(32.0GiB/14274msec)
<    bw (  KiB/s): min=  128, max=676577, per=1.88%, avg=23090.51, stdev=53845.44, samples=2856
>    bw (  KiB/s): min=22912, max=749336, per=1.93%, avg=45366.37, stdev=77455.84, samples=1442
<   WRITE: bw=1201MiB/s (1260MB/s), 1201MiB/s-1201MiB/s (1260MB/s-1260MB/s), io=32.0GiB (34.4GB), run=27280-27280msec
>   WRITE: bw=2296MiB/s (2407MB/s), 2296MiB/s-2296MiB/s (2407MB/s-2407MB/s), io=32.0GiB (34.4GB), run=14274-14274msec
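For comparing two fio logs without eyeballing a diff, the `write:` summary lines can be parsed and turned into a ratio. A rough sketch: `parse_fio_write` is a hypothetical helper that only understands the exact shape of the lines above (an optional `k` suffix on IOPS and bandwidth in MiB/s), not fio output in general.

```python
import re
from typing import Tuple


def parse_fio_write(line: str) -> Tuple[float, float]:
    """Return (IOPS, bandwidth in MiB/s) from a fio 'write:' summary line."""
    m = re.search(r"IOPS=([\d.]+)(k?), BW=([\d.]+)MiB/s", line)
    if m is None:
        raise ValueError(f"unrecognized fio summary line: {line!r}")
    # Assumption: only the 'k' (thousands) suffix appears on the IOPS figure.
    iops = float(m.group(1)) * (1000 if m.group(2) == "k" else 1)
    return iops, float(m.group(3))


# Summary lines copied from the diff: '<' is fio_auto.log, '>' is fio_4100.log.
auto_line = "write: IOPS=308k, BW=1201MiB/s (1260MB/s)(32.0GiB/27280msec)"
fixed_line = "write: IOPS=588k, BW=2296MiB/s (2407MB/s)(32.0GiB/14274msec)"

auto_iops, auto_bw = parse_fio_write(auto_line)
fixed_iops, fixed_bw = parse_fio_write(fixed_line)

print(f"bandwidth, 4.1GHz vs auto: {fixed_bw / auto_bw:.2f}x")
print(f"IOPS, 4.1GHz vs auto: {fixed_iops / auto_iops:.2f}x")
```

In this run the fixed 4.1GHz configuration comes out at roughly 1.9x the auto configuration on both figures.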

p.s. this tickled me - CBR15 on linux (beat my top windows score):


#113

Yeah, that 128GB kit I had on the Zenith Extreme was super temperamental to get working right, but it worked OK elsewhere.

I’m hoping this new-gen TR-specific memory resolves all that. I really do like Asus, and their BIOS is usually pretty amazing.


#114

I read that setting memory allocator to SLAB instead of SLUB improves the performance in a NUMA setup. I don’t know if it will help in this case.


#115

Maybe, maybe not; nothing is really guaranteed when it comes to overclocking.
It’s just a matter of luck.
The MSI MEG board itself should be capable, but then again it will also depend on the CPU and memory kit in question.


#116

How much CB score did Linux give you?


#117

Running numad? I noticed some processes getting pinned to cores that didn’t have memory, weirdly. Still looking into that.


#118

As I stated, if you are impacted by memory bottlenecks and need 32 cores, you are an EPYC customer.

Stop blaming Threadripper for trade-offs made to get the cost down for 32 cores in a desktop at this price point, and buy the appropriate CPU for the job. AMD makes them. They exist.

Yes, it is more money. It is still far cheaper than the Intel alternative.


#119

2990WX BIOS settings are fixed to numa per my understanding - you can’t run any other way as you can with 1950/2950. Your options are numa, numa, or numa in the BIOS.

I am not running the numad (user daemon):

  Loaded: loaded (/usr/lib/systemd/system/numad.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

I’ve not done anything with the kernel - stock 4.18.5-1 CentOS7 kernel-ml:
4.18.5-1.el7.elrepo.x86_64

numactl (not previously installed on this machine until you asked) says:

--show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 
cpubind: 0 2 
nodebind: 0 2 
membind: 0 2 
--hardware
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 0 size: 32130 MB
node 0 free: 11234 MB
node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 2 size: 32226 MB
node 2 free: 13396 MB
node 3 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 3 size: 0 MB
node 3 free: 0 MB
node distances:
node   0   1   2   3 
  0:  10  16  16  16 
  1:  16  10  16  16 
  2:  16  16  10  16 
  3:  16  16  16  10 
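The output above shows nodes 1 and 3 with CPUs but 0 MB of local memory, which is relevant to the question about processes landing on cores without memory. A minimal sketch for spotting such nodes from `numactl --hardware` text; `memoryless_nodes` is a hypothetical helper and only handles the `node N cpus:` / `node N size:` lines in the format shown here.

```python
import re
from typing import Dict, List


def memoryless_nodes(hardware_text: str) -> Dict[int, List[int]]:
    """Map each NUMA node reporting 0 MB of memory to its list of CPU ids."""
    cpus: Dict[int, List[int]] = {}
    sizes: Dict[int, int] = {}
    for line in hardware_text.splitlines():
        m = re.match(r"node (\d+) cpus:\s*(.*)", line)
        if m:
            cpus[int(m.group(1))] = [int(c) for c in m.group(2).split()]
            continue
        m = re.match(r"node (\d+) size:\s*(\d+) MB", line)
        if m:
            sizes[int(m.group(1))] = int(m.group(2))
    return {node: cpus.get(node, []) for node, mb in sizes.items() if mb == 0}


# 'cpus' and 'size' lines copied from the numactl --hardware output in the post.
sample = """\
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 0 size: 32130 MB
node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 0 MB
node 2 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 2 size: 32226 MB
node 3 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 3 size: 0 MB
"""

for node, node_cpus in sorted(memoryless_nodes(sample).items()):
    print(f"node {node}: no local memory, cpus {node_cpus[0]}-{node_cpus[-1]}")
```

Any thread scheduled on CPUs 32-63 here has to reach memory through nodes 0 or 2, which matches the `membind: 0 2` line.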

See picture in my comment: 6566 is the WINE + Linux score. Best Windows score is 6514, though that’s with process priority set to “real-time” and the usual cycle vampires killed off in Windows Task Manager, as one does to benchmark in that hot mess of an OS. The Linux score, by contrast, is just me running CBR15 alongside my browser as a regular process - no nice.


#120

@wendell What distro/kernel and benchmark are you running that you see degradation between 4.1 and PBO in I/O?


#121

Well, the more honest characterization is that AMD intentionally and artificially segmented their market and removed functionality in a way that potentially provides them ZERO cost savings. One can theorize that they bin dies with bad memory controllers for these, but… that’s a shot in the dark and subject to yield at any given time.

Various experiments (thinking of der8auer’s at the moment) have shown that the SP3 and TR4 sockets are mechanically compatible but use ID pins to differentiate, and we can assume AMD has blown one-time fuses on the die as well to ensure they are not 100% interchangeable (though again, der8auer’s tests have been able to overclock Epyc, so anything is possible at this point).

People tend to get irritated when someone shoves a dowel in their spokes to make sure they have to pay more for the faster model that isn’t dragging a tire, and I can’t blame them. Nor can I blame AMD. By leaving in ECC and 32 cores, they’ve left few other avenues to ensure that Epyc can continue to charge more.


#122

AMD making this eccentric but powerful CPU is awesome. It has to be a linux playground of tuning and tweaks to squeeze out all the performance.

Can’t wait to see how it plays out. I’m more a Ryzen guy but this tech will trickle down as Zen hits 7nm, 5nm, maybe 3nm.


#123

Yes, I think this might bring to the fore some of the “big.LITTLE” issues of ARM phone chips and make that a more normal “heterogeneous computing” feature of the Linux kernel. I’d like to see that anyway…


#124

You’re saying the extra traces for an 8 channel motherboard are free?

This is a way for AMD to get this out there without sending themselves broke for the people who need EPYC. You can take it, or you can take the $13,000 slower clocked Xeon?

A 32 core EPYC is still under half the cost of a Xeon 8180…