Memory Layout on ECC RAM

You can get spec sheets for Hynix RAM chips at this link. That can give you an idea of what the manufacturer rates the chips for and where to start pushing the limits.

Back to the actual rank question.

DIMMs have either 8 or 9 chips per rank, or double that when the manufacturer uses narrower chips (fewer data bits per chip) and combines them. That is how you end up with those registered ECC DIMMs with 36 or more chips, in two rows on each side. Those are most often dual rank, just with double the chips per rank.
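To make the chip-count arithmetic concrete, here is a rough sketch. It assumes a DDR3/DDR4-style rank that is 64 data bits wide, plus 8 check bits on an ECC DIMM, and the function names are made up for illustration. Chip width is the number of data lines per chip (x4, x8, x16).

```python
DATA_BITS = 64  # assumed data width of one rank (DDR3/DDR4 style)
ECC_BITS = 8    # assumed extra check bits on an ECC DIMM


def chips_per_rank(chip_width: int, ecc: bool) -> int:
    """Number of DRAM chips needed to fill one rank."""
    bus_bits = DATA_BITS + (ECC_BITS if ecc else 0)
    if bus_bits % chip_width:
        raise ValueError("chip width does not divide the rank width evenly")
    return bus_bits // chip_width


def chips_on_dimm(chip_width: int, ranks: int, ecc: bool) -> int:
    """Total chip count for a DIMM with the given number of ranks."""
    return ranks * chips_per_rank(chip_width, ecc)


print(chips_per_rank(8, ecc=False))          # x8 chips, no ECC
print(chips_per_rank(8, ecc=True))           # x8 chips, ECC
print(chips_on_dimm(4, ranks=2, ecc=True))   # x4 chips, dual rank, ECC
```

Under those assumptions, x8 chips give 8 per rank without ECC and 9 with it, while a dual rank ECC DIMM built from x4 chips needs 36, which matches the big registered sticks.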

This particular DIMM looks like single rank to me, just with the chips spread over both sides for whatever reason.

What I find interesting is that I have a 16 GB dual rank Samsung ECC DIMM that has 8 instead of 9 chips per rank. So the rule "9 chips => ECC" seems reasonable, though it is not always true.

@RageBone Is there anything inherently wrong with the layout of my DIMM?

So I recently saw this error, which I caused by running with my case fans off (may be a fluke).

It seems to have frozen after this (I wasn’t watching it). The expected and actual values differ in multiple bits, so the freeze makes sense? Now I need to see a single bit error…
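For anyone wanting to check how many bits actually differ between the expected and actual values memtest prints, a quick sketch: XOR the two values and list the set bits. The values below are made up for illustration, not the ones from the error above.

```python
def flipped_bits(expected: int, actual: int) -> list[int]:
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = expected ^ actual
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Hypothetical values, not taken from a real memtest run.
expected = 0xFFFFFFFF
actual = 0xFFFFFBEF

bits = flipped_bits(expected, actual)
print(f"{len(bits)} bit(s) differ, at positions {bits}")
```

One differing bit within an ECC word is the case SECDED can correct. Two or more is beyond correction, so a halt or freeze would not be surprising there.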

I’m not really the one to judge that.

Though if it is single rank, which it looks like to me, that wouldn’t be bad.
And just because a chip is on a different side doesn’t seem like a bad thing either. I mean, the traces and such won’t differ enough to reasonably make this DIMM worse than any other DIMM.

That is definitely different from what I saw. On the line describing the error, it mentioned ECC.

Last night I was playing with the CPU core voltage offset, so I didn’t play with RAM timings to try to get an ECC error. I will probably try tonight and take a picture.

I will have to play around with my system and remember what commands were used. But I think I recall these particular sticks being dual rank.

The chips are 16 gigabit, but a weird stacked kind made of two 8 gigabit dies, I think. So effectively it is like having both sides fully populated.

I could be wrong in my recollections, it is not unprecedented.

When I looked up the Hynix part number, they looked like they were 2 GB dies.

I might return this RAM, since the seller has such a lenient return policy. I just feel dodgy keeping something that I can’t verify on a system that I want to run 24/7.

Duhh on my part. That makes more sense!

Don’t know where or how I got the idea they were 16 Gb chips. Brain fart.

I, and practically everybody, would recommend Crucial with Samsung B-die.

My one consideration if I were in your shoes: would it be easier to verify a kit of Crucial B-die ECC RAM?

I will eventually be selling my 32 GB and getting B-die, but probably not for a while. I don’t really need it; I’m just a computer enthusiast who likes spending hours and hours tweaking and testing.

That’s an excellent point. The Crucial sticks mentioned in @wendell's video will run me an extra $100, but they are Micron dies afaik.

I could also just stick with this RAM and go with it… my current plan is to run this machine as a home server/NAS with ZFS, and I want complete confidence in my RAM. Do you suppose it is even possible to “fake” the ECC flags reported to the OS/BIOS? Is that just SPD information?
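On Linux, one thing worth looking at besides SPD is what the kernel's EDAC driver reports, since it exposes corrected and uncorrected error counters per memory controller under /sys/devices/system/edac/mc. A minimal sketch for reading them follows. The function name is made up, and an empty result only means no EDAC controller was found (driver not loaded, or ECC not active). A zero count on its own does not prove ECC is working.

```python
from pathlib import Path

# Standard sysfs location used by the Linux EDAC subsystem.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root: Path = EDAC_ROOT) -> dict[str, dict[str, int]]:
    """Read corrected (ce) and uncorrected (ue) error counts per controller."""
    counts: dict[str, dict[str, int]] = {}
    if not root.is_dir():
        return counts  # no EDAC directory at all
    for mc in sorted(root.glob("mc[0-9]*")):
        entry: dict[str, int] = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts
```

Calling `read_edac_counts()` after a long memtest or stress run and seeing `ce_count` go up would be reasonable evidence that single bit errors are being caught and corrected.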

It’s possible to fake the SPD info, but I can’t imagine they’d stick the 9th chip on the stick and not use it. The money is already spent for the extra chip.

Yeah, that’s fair. I’m still unsure if I should keep it, as I need to return it by tomorrow to meet the return window.

Well, it’s your cash, so I really can’t give good advice on that. I was impressed with the timings MtKingsnake got out of his, but every stick of RAM is like a snowflake. They’re all unique. :grin:

I just want to know that it can correct single bit errors and detect 2-bit errors. That’s all :frowning:
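For reference, "correct single bit errors, detect 2-bit errors" is what a SECDED code does. Below is a toy sketch using an extended Hamming (8,4) code on 4 data bits. Real ECC DIMMs protect 64 data bits with 8 check bits, and the actual code used by a given memory controller is not something this thread establishes, so treat this purely as an illustration of the principle.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 is Hamming(7,4)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    for p in (1, 2, 4):  # parity bits sit at power-of-two positions
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity enables double-error detection
    return code


def decode(code: List[int]) -> Tuple[str, Optional[int]]:
    """Return (status, data); data is None when the error cannot be corrected."""
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    overall = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as a single flip; syndrome 0 means bit 0 itself flipped.
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    data = fixed[3] | fixed[5] << 1 | fixed[6] << 2 | fixed[7] << 3
    return status, data


word = encode(0b1011)
word[5] ^= 1            # flip one bit: decoder corrects it
print(decode(word))
word[2] ^= 1            # flip a second bit: decoder can only detect it
print(decode(word))
```

The second case is why a multi-bit error tends to end in a halt rather than a silent fix: the decoder knows something is wrong but cannot tell which bits to flip back.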

@MtKingsnake did you happen to get a lot of freezes with memtest that report no errors? I am seeing that a lot in my testing.

Not a lot, but some.

I was able to do some fiddling today.

On the memtest screen there is an option to inject ECC errors.

It is the line that says “ecc injection: Disabled”. You can move down to it with the down arrow key, and the enter key toggles it between enabled and disabled. Once enabled, it appears to inject errors at the beginning of each test.

I couldn’t get memtest to throw ECC errors other than through injection, but my recent tweaking and tuning has been to lower CPU core voltage, and thus heat production. So my system was running substantially cooler, as I am now at a -0.9375 V offset even at an all-core 3.8 GHz. I can keep trying to get the RAM to throw ECC errors, but wanted to show you what I found.

I don’t think that shows that ECC error detection is working:

This seems to imply that hardware error injection is disabled on consumer Ryzen chips.

I see. You are correct. I just found this.

Looks like if memtest says it injects but no error is reported, then the injection function is disabled in hardware.
