Can bad XMP crash software (without OS crashing)

Is this a thing?
When I have had issues it has always taken down the whole OS.

I had some memory errors with a Ryzen 3900X running 3,600 MHz RAM on its XMP profile. They seemed to happen in the compiler output and the OS only rarely crashed. The errors also only seemed to happen when putting the CPU under heavy load.

Since it worked fine with the release BIOS and was then broken with a later BIOS / AGESA update I blame it on some voltage setting somewhere. Probably low SoC power. If you remember, one of those early BIOS updates was to fix a Ryzen high voltage issue. I suspect it fixed it too well.

Yes, a bad or unstable overclock can cause memory errors, and those errors don’t necessarily occur across the whole address space.
If there is no ECC memory, they go undetected and can have various effects:

  • Some of those errors can be in unused memory regions - nothing bad happens.
  • Some can be in kernel/system used regions - will most likely cause a panic/restart/corruption/etc.
  • Some can be in the user program ranges - which may result in bad data, or the program may do something that gets it killed by the OS (for example, if a pointer was modified and then dereferenced, the address could be outside the allowed bounds, in which case the OS would most likely kill the process).
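
To make the last two cases concrete, here is a minimal Python sketch of what a single flipped bit can do. It is only an illustration: `flip_bit`, `table` and `read_entry` are made-up names, and a list index stands in for a real pointer, with Python's `IndexError` playing the role of the OS killing the process.

```python
import struct

def flip_bit(buf: bytearray, bit_index: int) -> None:
    """Flip one bit in place, imitating an undetected memory error."""
    buf[bit_index // 8] ^= 1 << (bit_index % 8)

# Case 1: the flipped bit lands in plain data -> a silently wrong value.
data = bytearray(struct.pack("<d", 1.0))
flip_bit(data, 62)  # top exponent bit of a little-endian double
corrupted_value = struct.unpack("<d", bytes(data))[0]

# Case 2: the flipped bit lands in an index (a stand-in for a pointer).
table = list(range(16))
index = bytearray(struct.pack("<I", 3))
flip_bit(index, 20)  # adds 2**20 to the stored index
bad_index = struct.unpack("<I", bytes(index))[0]

def read_entry(i: int) -> str:
    """Look up an entry; an out-of-range access is reported as a fault."""
    try:
        return f"ok: {table[i]}"
    except IndexError:
        return "fault: access outside the allowed range"

print(corrupted_value)        # the data case: no error raised, just a wrong number
print(read_entry(3))          # the intended lookup
print(read_entry(bad_index))  # the corrupted "pointer" case
```

The first case is the nasty one: nothing crashes, the program just carries on with bad data.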

Now, if ECC memory is present and working and we encounter an uncorrectable error, the situation can be very different depending on the hardware, OS configuration and the affected memory address:

  • Some configs will Halt unconditionally
  • Some will Panic/restart
  • Some will kill the offending user process
  • Some will just try to save logs and carry on.

https://www.kernel.org/doc/html/latest/admin-guide/ras.html#types-of-errors
https://www.kernel.org/doc/Documentation/x86/x86_64/machinecheck
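
If you are on Linux with ECC and an EDAC driver loaded, the corrected and uncorrected error counters described in the RAS guide above can be read from sysfs. A rough sketch, assuming the usual `/sys/devices/system/edac/mc/mcN/ce_count` and `ue_count` layout; the function name `read_edac_counts` is my own, and on a non-ECC system you will most likely get an empty result.

```python
from pathlib import Path

def read_edac_counts(base: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce": n, "ue": n}} for each EDAC memory controller."""
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts  # EDAC driver not loaded, or not a Linux system
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, name in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts

if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found")
    for name, c in result.items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

A corrected count that keeps climbing under load is a hint that the memory settings are marginal, even if nothing has crashed yet.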

XMP is basically just preset memory speeds (preset factory overclocks).
An error in memory usually means really bad stuff happens.
Try clocking your memory down a bit, maybe.

Is your CPU overclocked? It seems like a CPU problem more than a memory one.

Oh I solved my problem. I had 64 GB of ECC RAM on order and that’s what I’m using now.

XMP is an overclock. XMP is an Intel thing originally and for Ryzen CPUs the values are translated by the motherboard BIOS. ASUS calls that DOCP.

Anyway, on my 3900X it was definitely a CPU problem, but with XMP that is usually what happens. With XMP the memory manufacturer has tested the kit and knows it is good for those speeds. But there is NO GUARANTEE that your CPU can handle it.

This could easily be the issue I have been having with too-frequent Autodesk crashes. I have been running for 2 days with XMP off and have not had any crashes. But I haven’t done the very complex workflows that are more prone to crashing, like booleans in Max.

It’s very possible. Not long ago I had a system on my test setup that would fail miserably in Photoshop or even Chrome, but would never crash the entire OS no matter what I tried. It turned out one of the memory sticks was defective (tested with MemTest86); once I received a replacement, the system functioned normally.
Sometimes it happens with an OC as well, which I imagine is your case. I did a poor OC on my current memory kit to see if it would run stable at 3400 MHz. It could benchmark perfectly for up to two hours, but it crashed GTA V, always at the same spot (the set of bridges right next to the hospital).

I’d say test your memory with MemTest86. If a small number of errors show up, look into adjusting your memory OC; if a lot of errors show up, one of your memory sticks is likely defective.
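
For anyone wondering what a memory test does at its simplest, here is a toy write-and-verify pattern test in Python. It is not a substitute for MemTest86: it only touches whatever memory the OS hands to the process, CPU caches can hide errors, and the name `pattern_test` and the pattern choice are my own assumptions.

```python
def pattern_test(size_bytes: int = 64 * 1024 * 1024, passes: int = 2) -> int:
    """Fill a buffer with known patterns, read it back, count mismatched bytes."""
    patterns = (0x00, 0xFF, 0xAA, 0x55)  # all zeros, all ones, alternating bits
    errors = 0
    buf = bytearray(size_bytes)
    for _ in range(passes):
        for p in patterns:
            buf[:] = bytes([p]) * size_bytes      # write the pattern everywhere
            errors += size_bytes - buf.count(p)   # any byte that differs is an error
    return errors

if __name__ == "__main__":
    print("mismatched bytes:", pattern_test())
```

A result of zero here proves very little, which matches the experience in this thread of kits that pass long test runs and still crash specific applications.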

It does indeed seem that the G.Skill Sniper kit I bought is not actually fully stable at its 3000 MHz XMP settings with my Ryzen 3950X setup, even though I have run Memtest and several benchmarks without problems, and games never crash.
What does crash are Maya and Max.
If I were to upgrade to faster RAM, do you think 2x 32 GB would be a good idea? Do you guys recommend any memory for Ryzen? I am on an Aorus Pro X570 board.

Do you think this would work?

Patriot kits have been good for me over the years; of the systems I built, none came out defective.
64 GB is quite a lot of RAM, but the price for 3600 MHz and 64 GB really isn’t bad (in comparison with my local market, at least).

Good to hear. If these would run truly stable even at 3000 MHz, that would be a win for me. Yeah, I think I will buy one of these kits first, and I might need to get another, as even 64 GB isn’t always enough for me.

It should run at XMP speeds, I believe; I think your X570 should be able to do it.

That would be great.

Absolutely. I used to have crashes in one specific CoD Warzone mode due to bad memory. It would play well in every other scenario I used the computer in.
That’s an insane level of weirdness, but it can absolutely happen that something might not work while everything else hums along just fine.

This pretty much…

Nothing is really guaranteed to work when it comes to overclocks.

Very interesting, indeed. I have a gut feeling that the memory clock was indeed the reason for the crashes. I have now run for three days at stock speed with no issues.
Also worth noting: I can’t notice any difference between 2400 MHz and 3000 MHz with the stuff I do.

Unfortunately it seems that 3ds Max crashes just the same no matter which XMP settings I use. I am getting the same Unhandled Exception.

Is your memory kit on the motherboard manufacturer’s QVL? If not, the automatic timing tuning the board is doing, especially for secondary and tertiary timings, might be the culprit.
If you want to run above stock speed you could try the memory calculator, which should punch in decent enough timings and voltages to make your system stable.
It’s unfortunate not being able to extract as much as possible from the current configuration you have.

I have a memory kit that can pass a 24-hour MemTest86 run, but Witcher 3 would crash within minutes.
I spent days testing with different timings and other BIOS settings.
Eventually I noticed some cache errors in the kernel log. I changed the CPU and reinstalled the OS, and it’s still crashing and throwing cache errors.
Computers are hard, and sometimes 2666 is all a system can do.

Do you only experience issues in Witcher 3?
Or also with other tasks, games, or workloads?