[SOLVED] 5995wx on ASUS WRX80E-SAGE with 1TB memory installed goes into an infinite Q-Code loop

dahlia123 · January 9, 2023, 9:07am

Hey all,

I have a Threadripper Pro 5995wx build with 1TB of RAM (8 x Samsung DDR4-PC25600 128GB ECC-REG), and it goes into an infinite Q-Code loop when all 8 DIMM slots are occupied.

CPU: Threadripper Pro 5995wx

MB: ASUS WRX80E-SAGE SE WIFI (Bios ver: 1003 newest)

RAM: 8 x Samsung DDR4-PC25600 128GB ECC-REG (1TB in total)

PSU: Micronics 1050W

GPU: GTX 1070

With all 8 DIMMs installed, it won’t POST and gets stuck in an infinite Q-Code loop: 20 - 44 - 01 - 66 - dE - Ad. The LED indicator shows a white light.

Interestingly, if I remove any 1 of the 8 DIMMs, it boots successfully. Literally any combination of 7 DIMMs in 7 slots boots.

Here is what I have tested so far:

I have a second build with a 3995wx: all other parts are exactly the same. Only the CPU is different. This system works perfectly with 1TB of memory.
I swapped the 5995wx and 3995wx between the two systems. After powering on, the system with the 5995wx fell into the same Q-Code loop, and the system with the 3995wx worked perfectly. Therefore, I think the issue is not caused by faulty DIMM slots.
With the 5995wx, I installed 8 spare Samsung 32GB DDR4-pc25600 DIMMs (256GB in total), and the system worked with absolutely no errors. This means that the 5995wx doesn’t have an issue with the eight-channel memory configuration itself.
Swapped the GPU for a GT1030, G710 etc, but there was no difference.
Changed the memory clock to 2666 MHz, but it didn’t help.
Changed the 1TB remap option in the bios, but nothing changed. (I don’t know what the option means.)
Cleared the bios, and that didn’t help either.

I’m stumped on this one because it’s so unusual that 7 sticks of 128GB RAM work perfectly but 8 don’t. Everything else works. But eight sticks for 1TB? It’s a limbo of 20 - 44 - 01 - 66 - dE - Ad!
I’m currently running with 6 sticks, as AMD recommends here (https://www.amd.com/system/files/TechDocs/56873_0.80_PUB.pdf).

Should I wait for a new bios? I’m seriously considering purchasing a new MB (something other than the ASUS WRX80E-SAGE).

Please, any suggestions? Thanks!

ryandotsmith · January 14, 2023, 7:39am

Hello! I am having the same issue.

ASUS Pro WS WRX80E-SAGE SE WIFI
AMD 5975WX
8x M386AAG40BM3-CWE 128GB DDR4-3200 LRDIMM by NEMIX RAM

I get the same loop of MB codes. Removing 1 stick enables the system to boot and run without errors.

twin_savage · January 14, 2023, 8:27am

Perhaps you got a 5995wx with a weak IO die? I wonder if you’d see the same results using registered memory instead of LRdimms. Ironically, in my experience LRdimms stress the IMC more than plain registered dual-rank memory.

dahlia123 · January 15, 2023, 3:25pm

Thank you for the reply!
I’m currently using Rdimms. Anyway, it is interesting that LRdimms stress the memory controller more than normal Rdimms. This is good information for me.

dahlia123 · January 15, 2023, 3:32pm

Hi,

You are having the same Q-Code loop with 5975wx and 1TB memory. Now I suspect that there is a bug in BIOS 1003 that prevents the 5000wx CPU from working with 1TB of memory.

twin_savage · January 15, 2023, 9:55pm

ahh, I had assumed you were on LRdimms with such high dimm capacities.

You’re probably right about the bios being the issue. Asus is still on AGESA ChagallWS 1.0.0.1 with the 1003 bios, while Asrock has moved on to AGESA ChagallWS 1.0.0.5 for its TR pro MB (or 1.0.0.4 for the non-beta).

ryandotsmith · January 16, 2023, 4:34am

Yes. Same exact error code as what you reported. I should also say that I’m running BIOS version 1003 as well.

ryandotsmith · January 16, 2023, 4:38am

I’ve spent quite a bit of time searching to see if anyone has mentioned running 1TB ram, asus sage, TRP 5XXX and I haven’t found a single result. I’m a bit worried that this MB/BIOS combo is unable to support 1TB.

dahlia123 · January 16, 2023, 9:42am

I was also looking to see if anyone else was having this problem, but I couldn’t find anyone. Thank you for responding!

Because I’m currently using the same MB + 1TB ram with a 3995wx without this issue, I believe it occurs only when a 59x5wx CPU is used.
Anyway, now that I’m convinced it’s a bios issue, I’ve sent an e-mail to ASUS to inquire. I’ll update you as soon as I receive a response from them.

dahlia123 · January 16, 2023, 9:47am

Yes, the most recent 1003 bios is about ten months old. I’m hoping ASUS will update the bios soon.

Thanks!

ryandotsmith · January 16, 2023, 5:42pm

Wonderful! Thank you for doing that. I’m eager to hear back.

In the meantime, I am thinking of exchanging my memory. I noticed that the ASUS QVL only lists Samsung M393AAG40M3B-CYFCQ for 128GB sticks, and I am currently running M386AAG40BM3-CWE.

trx80a · January 19, 2023, 5:27pm

Okay, count me in with the same issue! Here are my specs:

Motherboard ASUS Pro WS WRX80E-SAGE SE WIFI (latest BIOS 1003)
Ryzen 5995WX
8x 128 GB DIMMs Samsung M393AAG40M32-CAECO (ECC Reg) 3200

Absolutely the same symptoms with all 8 DIMMs: white diagnostic LED with cycling Q-Codes of 20 - 44 - 01 - 66 - DE - AD. The system boots with 7 DIMMs, and even with 7 128 GB DIMMs plus one 64 GB DIMM! However, no dice with 8 128 GB DIMMs.

I tried several different combinations of memory slots and DIMMs to rule out any single point of failure, and the behavior was the same every time. It seems like when the full 1 TB of RAM is populated, something in the BIOS (or in the CPU memory controllers?) bugs out.

I also tried increasing the DIMM voltage and the CPU SOC voltage, increasing the load tolerance of the memory VRMs, decreasing the switching frequency of the memory VRMs, and decreasing the memory speed to 2666. Nothing helped.

While changing the memory modules, I accidentally failed to push one of them all the way down, and I saw what a real memory error looks like: yellow/amber diagnostic LED and Q-code C5 with no further activity and no boot.

The system works fine with 6 modules (in 6 channel mode) but the memory read speed drops from 190 GB/s to about 140 GB/s in AIDA64 (compared with 8x 64 GB DIMMs). That’s not a big deal, but I really need the full 1024 GB of RAM, and this is not good.
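As a rough sanity check on those AIDA64 numbers, here is a minimal sketch of the theoretical peak for 8 versus 6 channels. It assumes DDR4-3200 (3200 MT/s) and the standard 64-bit (8-byte) DDR4 data bus per channel; the function name is made up for illustration, and real read speeds always land somewhat below the theoretical peak.

```python
def peak_bandwidth_gbs(channels, mega_transfers=3200, bytes_per_transfer=8):
    """Theoretical peak in GB/s: channels x MT/s x bytes per transfer."""
    return channels * mega_transfers * bytes_per_transfer / 1000


eight = peak_bandwidth_gbs(8)
six = peak_bandwidth_gbs(6)

print(f"8 channels: {eight:.1f} GB/s")
print(f"6 channels: {six:.1f} GB/s")
print(f"theoretical 6/8 ratio: {six / eight:.2f}")
print(f"measured 140/190 ratio: {140 / 190:.2f}")
```

The theoretical ratio (0.75) and the measured one (about 0.74) line up closely, so the drop looks like what you would expect from simply losing two channels rather than a separate fault.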

Now I’m back to using 8x 64 GB Samsung M393A8G40AB2-CWEBY, which work absolutely fine in 8-channel mode. Really hoping to get a new BIOS from ASUS.

ryandotsmith · January 24, 2023, 6:24am

I wonder if we need to use a 1TB kit instead of 8x128GB individual sticks. I know that the memory kits can sometimes be different than individual sticks.

ryandotsmith · January 27, 2023, 12:20am

@wendell I know you like this motherboard, and I got excited by your video on it, but you may want to recalibrate your recommendation based on the fact that you can’t use 1TB of ram and have it “just work.”

SWPadnos · January 27, 2023, 12:52am

My hunch is that it wouldn’t matter. Given that @trx80a had things working with 8x64G DIMMs, I’d lean towards some kind of BIOS bug. When the memory count needs to use bit 30, something croaks. That’s the highest bit in a 32-bit signed integer (since bit 31 would wrap around and be negative). I wouldn’t be surprised if some 32-bit legacy code in the BIOS memory check is using that bit as some kind of error flag, which causes the boot loop.

That’s all supposition about a BIOS bug, but given that there’s an example in this thread of all 8 DIMMs working, but not with 1024GB total, it seems like getting new memory isn’t likely to help.
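To make the bit-30 idea concrete, here is a minimal sketch of the arithmetic. It assumes the firmware counts memory in KiB, which is a guess: nothing in this thread says what unit the BIOS uses (in bytes, 1 TiB would first set bit 40, and in MiB bit 20). The helper name is hypothetical.

```python
GIB_IN_KIB = 1 << 20  # 1 GiB = 2**20 KiB


def highest_bit_kib(total_gib):
    """Index of the highest set bit when the total size is expressed in KiB."""
    return (total_gib * GIB_IN_KIB).bit_length() - 1


# Configurations reported in this thread (total capacity in GiB).
configs = {
    "8 x 64 GB": 512,
    "7 x 128 GB": 896,
    "7 x 128 GB + 1 x 64 GB": 960,
    "8 x 128 GB": 1024,
}

for name, total in configs.items():
    print(f"{name:24s} {total:5d} GiB -> highest bit {highest_bit_kib(total)}")
```

Under that assumption, only the 8 x 128 GB configuration reaches bit 30. Every configuration reported as booting tops out at bit 29, which is consistent with the hunch but does not prove it.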

wendell · January 27, 2023, 2:06am

it’s likely a bios bug, there was a bios bug around the nvme support as well… the contacts I had at asus are gone or don’t respond because, I guess, level1 is too fringe or whatever.

ironically the 3000 series cpu with 1tb and oldddd bios for this board does work I think. You may be able to change the address bitmask to get 1tb to work since the motherboard tries to put stuff in memory mapped io at the 1tb border and this bug smells like that’s what it’s doing. tbh surprised asus support hasn’t emailed you a bios with the fix?
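A minimal sketch of what a collision at the 1 TB border would look like. The reserved window below (the last 12 GiB under the 1 TiB boundary) is a hypothetical placement chosen only for illustration, and the sketch simplifies by mapping DRAM contiguously from address 0, which real firmware does not do.

```python
GIB = 1 << 30
TIB = 1 << 40

# Hypothetical reserved window just below the 1 TiB boundary (illustration only).
RESERVED = (TIB - 12 * GIB, TIB)


def overlaps(a_start, a_end, b_start, b_end):
    """True if the half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def dram_hits_reserved(total_gib):
    """Simplification: DRAM occupies [0, total) with no holes or remapping."""
    return overlaps(0, total_gib * GIB, *RESERVED)


for total in (256, 768, 896, 960, 1024):
    print(f"{total:5d} GiB -> collides with reserved window: {dram_hits_reserved(total)}")
```

With these assumptions only the full 1024 GiB configuration reaches the window, which matches the pattern in this thread where 960 GB boots and 1024 GB does not. Remapping the memory above the boundary is the sort of thing an option like the "1TB remap" setting would presumably control.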

ryandotsmith · January 28, 2023, 3:37am

I went ahead and ordered 8x of Samsung M393AAG40M3B-CYFCQ. Unfortunately I get the same error codes.

I spent some time today talking with a supervisor (for whatever that’s worth) at Asus and I submitted a bug report to their technical team. I’ll report their reply as soon as I receive it.

dahlia123 · January 28, 2023, 9:54am

3000wx doesn’t have this issue. As I posted, I’m currently using two systems:

5995wx + wrx80e-sage + 8 x 128GB + bios 1003 = Q-code loop
3995wx + wrx80e-sage + 8 x 128GB + bios 1003 = works perfectly with no errors

Both systems have the same memory modules (Samsung M393AAG40M32-CEC0 PC25600 128GB ECC-REG).
On 3000wx, there is no need to use an old bios, because 1003 works perfectly.

dahlia123 · January 28, 2023, 10:20am

It’s too bad that the new modules from the QVL don’t work either.

When I tried to contact the ASUS tech team via the website, I was asked to enter the serial numbers of the MB, and my report was routed to the distributor rather than ASUS (I live in South Korea). The distributor in my country said that they must conduct an internal test to see if the same error occurs, and then report the results to the ASUS technical team. Although I understand their policy, it is clearly inefficient. I asked if I could get the tech team’s e-mail address so that I could contact them directly, but they refused. They simply said that they WILL conduct the test and that it will take some time.

Thank you for reporting to ASUS, leaving comments, and sharing the test results. I’m hoping to hear back from ASUS and that they will put out a new bios soon.

dahlia123 · January 28, 2023, 10:50am

Thank you for your feedback! Ironically, I’m relieved to see another guy who is experiencing the same error. I hope ASUS recognizes this problem soon.

In terms of memory read speed with 6 or 7 modules, I ran a test using COMSOL Multiphysics. The test consisted of solving electrodynamics problems for two different scenarios, and the results were frustrating. Thank you so much for informing me of this important fact. I really appreciate it!

Test 1 (memory occupation: ~6 GB)

5995wx + 8 x 32GB (256 GB in total) = 18min 32s
5995wx + 6 x 128 GB = 18min 29s
5995wx + 7 x 128 GB = 18min 54s

Test 2 (memory occupation: ~200 GB)

5995wx + 8 x 32GB = 39min
5995wx + 6 x 128 GB = 10h 50min
5995wx + 7 x 128 GB = several hours, and canceled the test
5995wx + 7 x 128 GB + 1 x 64 GB = several hours, and canceled

I was shocked by this result and immediately went back to 8 x 32GB although this strongly limits the computation capacity.
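For a sense of scale, here is a minimal sketch that turns the completed timings above into ratios against the 8 x 32GB baseline. Only runs that finished are included; the helper name and dictionary layout are just for illustration.

```python
def minutes(h=0, m=0, s=0):
    """Convert hours, minutes and seconds to minutes."""
    return h * 60 + m + s / 60


# Test 1 (~6 GB in use)
test1 = {
    "8 x 32GB": minutes(m=18, s=32),
    "6 x 128GB": minutes(m=18, s=29),
    "7 x 128GB": minutes(m=18, s=54),
}

# Test 2 (~200 GB in use), completed runs only
test2 = {
    "8 x 32GB": minutes(m=39),
    "6 x 128GB": minutes(h=10, m=50),
}

for label, runs in (("Test 1", test1), ("Test 2", test2)):
    base = runs["8 x 32GB"]
    for config, t in runs.items():
        print(f"{label}: {config:10s} {t:7.1f} min  ({t / base:.2f}x baseline)")
```

In Test 1 all three configurations are within about 2% of each other, while in Test 2 the 6 x 128GB run takes roughly 16.7 times as long as the baseline. That is far more than the 25% bandwidth loss from dropping to 6 channels would explain on its own.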