Desperate help needed from Threadripper owners

Welcome to the forum!

Thanks for your input. I believe it may be a memory controller issue. I’ll debug further tomorrow, but it feels like a lost cause at this point.

2 Likes

I have read over this and I might have missed something, but IIRC the IMC on X399 only supports 2133 in a dual-slot, dual-rank config?

I suspect boosting the IMC/MC voltage will bring stability around 2133 or possibly 2166.

64gb of dual rank in 4 dimms may be all you can reasonably expect on x399

Actually it was 1866 max official speed. So 2133 for 128gb is probably the ceiling of what you can expect with 1866 being the floor?

2 Likes

Thanks for the tips Wendell, I’ll try that. I did try 2133 MT/s, which didn’t work out, but I never went lower.

Yeah, maybe 8 slots with dual rank might have been overkill. I’ll test tomorrow.

I don’t remember reading anything about x399 only working in dual channel with dual rank memory, but that’d be pretty crazy. I don’t even know if I could find single rank 16 GB quad channel kits; finding that DDR4 kit was a pain as it was.

1 Like

I meant that dual rank with dual DIMMs per channel was a sort of worst-case scenario.

If 1866 posts it’s worth upping the imc/soc voltage.

The symptoms fit: Sarge’s kit would have been 1.35 V; the JEDEC kit you have is likely 1.2 V.

Dialing in 1.35 V memory voltage and/or upping the SoC voltage by 0.1 V or so might connect the dots here.

At a lower speed ofc. But better than 1866.

2 Likes

Yup, upping the SoC voltage to 1.1 V, or 1.2 V max, could help improve memory stability as well when all slots are populated.

2 Likes

Looking at 1950X / X399 & Achieving Memory Speed | [H]ard|Forum I think you might be out of luck with current hardware, as RAM compatibility seems to be really poor at best. Given its age, I wouldn’t be too surprised if it doesn’t like high density memory chips.

2 Likes

Reporting in.
> booted PC today
> all 128GB detected on first try
> rebooted just to make sure I’m not insane
> 96GB detected
> messed with timings and voltage
> got 1.25v and 2666 and the JEDEC specs shown on the DRAM timing settings
> got all 128GB detected
> reboot
> 32GB detected
> mess with frequency
> go 1866 and 1.3v with same loose timings
> 64GB detected
> set all to default / Auto
> enabled TR Advanced Boot Training
> all 128GB detected (of note, this was with 1.2v, 2666 and 18-19-19-19-43-61 timings, default for my kit)
> reboot
> 96GB detected

Keep in mind guys that my system does not fail to post. It did fail to post a few times when testing other timings and frequencies, but it reverted to the previous config and I changed the settings to new ones that worked.

But I have to insist, all 8 slots are being detected, just not the capacity. And when I have only n slots populated, only those slots are shown, so the motherboard does see the RAM sticks, it just decides not to use them.
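
For anyone who wants to track this kind of flaky detection from the OS side rather than from the BIOS screen, here is a minimal Python sketch. It assumes a Linux system where `/proc/meminfo` is readable; the function names `mem_total_gib` and `missing_dimms` are made up for illustration, and the 128 GB / 16 GB figures are just the numbers from this thread.

```python
import math
import re


def mem_total_gib(meminfo_text: str) -> float:
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / (1024 * 1024)


def missing_dimms(meminfo_text: str, installed_gib: int, dimm_gib: int) -> int:
    """Estimate how many DIMMs' worth of capacity the OS is not seeing.

    MemTotal sits slightly below the installed capacity because of
    kernel and firmware reservations, so usable memory is rounded up
    to whole DIMMs before comparing.
    """
    installed_dimms = installed_gib // dimm_gib
    seen_dimms = min(installed_dimms, math.ceil(mem_total_gib(meminfo_text) / dimm_gib))
    return installed_dimms - seen_dimms


if __name__ == "__main__":
    # Hypothetical usage: 8 x 16 GB sticks installed, as in this thread.
    with open("/proc/meminfo") as handle:
        print("DIMMs missing:", missing_dimms(handle.read(), 128, 16))
```

Running it after each reboot and appending the result to a log would give a record of how often the full capacity shows up, which is easier to compare across settings than notes taken by hand.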

2 Likes

Another update. After using only the 2018 kit on slots 1 only (as I had them), I was failing to post with memory issues. This was expected though, as you are supposed to only use slots 2 for a quad channel kit.

Swapped them around to slot 2 and after more than 10 reboots, I get consistent RAM detection (64GB). I thought maybe RAM closer to when the TR was released / supported might work better with it. Seems like I was right. So now I am starting to believe something may be wrong with the 2020 kit. They are the same model and everything, just a bit newer, so who knows what manufacturing defect was introduced along the way.

I will test with the 2018 kit in slots 1 and 2020 kits in slots 2 next and see if I can get even a glimpse of consistent memory detection.

2 Likes

Do both kits share the same number of chips and layout? You might also be looking at some SPD info differences which confuse the memory controller, but I think it unfortunately just boils down to poor compatibility.

1 Like

Yes, but they’re OEM only and crazy expensive…

2 Likes

They have the same amount of chips, size, frequency and all, with the only difference that I could spot in the DRAM Information being that the 2018 kit had tRRD_S = 5, while the 2020 kit had it = 6. All the other numbers were the same.

Seems like with just quad channel with the older kit, I get proper detection.

I guess 128GB was overkill anyway. In all honesty, 32GB might have been enough and 64GB should be plenty. I just wanted to use the big capacity for ZFS.

I’ll be returning the 2020 kit and only using 64GB. I can’t win all battles, especially if I don’t even have a way to actually debug what is happening with the detection.

5 Likes

Yeah, you probably need to settle with that amount using that specific setup.

2 Likes

Even more updates and an abrupt derail of the thread.

So, being satisfied with the fact that I got 64GB going, I went ahead and installed Linux, root on ZFS, set stuff up, and installed some useful utilities, among them dmidecode to check the RAM. To my shock, the kit only has a 64-bit width, meaning it is not ECC.
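
The width check described above can be sketched in Python. This assumes the text comes from `dmidecode -t 17` (the Memory Device table), where each populated DIMM reports a `Total Width` and a `Data Width`; an ECC module typically shows 72 bits total against 64 bits of data, while a non-ECC module shows 64 and 64. The function names are hypothetical.

```python
import re


def dimm_widths(dmidecode_text: str) -> list[tuple[int, int]]:
    """Return (total_bits, data_bits) for each populated Memory Device block.

    Empty slots report "Unknown" instead of a number, so they do not
    match the patterns below and are skipped.
    """
    widths = []
    for block in dmidecode_text.split("\n\n"):
        total = re.search(r"Total Width:\s+(\d+) bits", block)
        data = re.search(r"Data Width:\s+(\d+) bits", block)
        if total and data:
            widths.append((int(total.group(1)), int(data.group(1))))
    return widths


def looks_like_ecc(dmidecode_text: str) -> bool:
    """True only if every populated DIMM has extra bits beyond the data width."""
    widths = dimm_widths(dmidecode_text)
    return bool(widths) and all(total > data for total, data in widths)
```

This only reflects what the firmware reports for the modules. Whether ECC is actually active also depends on the board and CPU, so treat it as a quick sanity check on the sticks rather than proof of working ECC.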

I’ve bought the RAM from Newegg. When talking with Sarge, the offer seemed very good. I was looking through a filter with ECC enabled, this kit showed up and so I bought it. I feel so embarrassed and bamboozled. Searching one stick’s part number online, I can see that it is in fact non-ECC desktop RAM.

I’m so utterly pissed right now that I almost feel I should try to unlock my Amazon account and look there. And I hate Amazon, so you can imagine how annoyed I am right now by the fact that the filter for ECC did not work properly on newegg.

Returning both kits. Fugg this.

@wendell do you happen to know what kind of memory I should look for if I want to populate all 8 slots? Should I look for 1R 2667 MT/s 16GB sticks? According to dmidecode, the system supports up to 512GB of RAM. I personally don’t care for more than 128GB, which is already plenty, but I really appreciate your advice.

[oddmin@biky-tr ~]$ doas dmidecode -t 16
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.

Handle 0x0008, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 512 GB
        Error Information Handle: 0x0007
        Number Of Devices: 8

1 Like
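
As a rough illustration of reading that table programmatically, here is a small Python sketch that turns `dmidecode -t 16` output into a dictionary and derives the implied per-slot maximum. The helper names are made up, and it assumes the capacity is reported in GB as in the output above. The figure is only what the firmware reports through SMBIOS, not a guarantee of what the board can actually run.

```python
def parse_memory_array(text: str) -> dict[str, str]:
    """Collect the "Key: Value" lines of a Physical Memory Array block."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip headers and comment lines that have no "Key: Value" shape
            fields[key] = value
    return fields


def max_gb_per_slot(fields: dict[str, str]) -> int:
    """Divide the reported maximum capacity evenly across the reported slots."""
    amount, unit = fields["Maximum Capacity"].split()
    if unit != "GB":
        raise ValueError(f"unexpected capacity unit: {unit}")
    return int(amount) // int(fields["Number Of Devices"])
```

Fed the output above, `max_gb_per_slot` gives 512 // 8 = 64, and `fields["Error Correction Type"]` is `"None"`, matching the non-ECC finding.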

Ah, it’s just 256, not 512. 512 only works if all 8 channels are wired in (they aren’t).

FWIW, I’d probably get Rome or Milan. Registered ECC will be cheaper and more better. I might even have a decent P-series Rome CPU here somewhere.

2 Likes

According to Kingston these should work ™

I had a look at Crucial (Micron) but those are quite a bit more expensive from what I could tell

You can probably use something like Google Shopping to track down the best pricing on these

Always verify part number and UPC number (if possible) when using generic filters

2 Likes

Zen 1 and 16 cores is already past overkill for what I need, keep in mind the system is going to be idle to 50% utilized most of the time. I’m not planning on having many VMs on it, this is going to be mostly a VFIO build with a few VMs for some random testing. And being TR, I don’t intend to run it 24/7.

1 Like

This topic was automatically closed 273 days after the last reply. New replies are no longer allowed.