Hello everyone. I am at a loss on what to do. I have my Unraid server on an Asus motherboard and the M.2 drive in the DIMM.2 slot is getting warm at random. I have as much airflow as possible with my fan configuration, and it is only that one drive. It has a heatsink as well. Looking for any recommendations. Water cooling is not an option. As far as I can remember, I did not have this problem with my last motherboard, which also had a DIMM.2 slot.
Would you mind telling us which SSD you use and what temperatures you see which make you concerned?
Sorry about that. I should know better. Here are two drives. I just noticed another one getting hot as well.
SanDisk Ultra 3D NVMe ~50°C
Samsung 970 EVO ~55°C
- This drive is on the motherboard itself and is under a large heatsink.
I saw that some people’s drives get higher than that but I do not know if I should be concerned.
I don’t think you should be concerned at all with those temperatures. NAND memory likes temperature, especially when writing to it. The only component that might have problems with temperatures is the controller, that’s basically a CPU. But your drives are nowhere near throttling on the controller or unsafe temperature for the NAND memory itself.
If you’re still concerned, I think the thermal pad might be a bit too thick and is making the drive bow a little. That reduces contact with the heatsink and makes the drive run hot. To verify this, take the DIMM.2 card out and check with a flashlight. If there’s light coming through, you only need to buy a new thermal pad.
I feel like I need to ‘nip this one in the bud’ because it perpetuates an incorrect and potentially detrimental myth. NAND flash will wear less when hotter ONLY WHEN WRITING. Also the only evidence I have ever found that supports this is an article from back before NVME, and on SATA SSDs the guy found that writing to the NAND flash at 25C as opposed to 50C could reduce max TBW by up to 50%.
Here’s the thing though, with most modern drives and a properly set up OS, the average user is never going to come even remotely close to max TBW. Not to mention the NAND flash chips themselves will heat up whenever electricity is sent through them, which is probably why it is less of a concern on more powerful NVMe drives. So in the end you probably won’t see any real difference in longevity by intentionally making the drive run hot.
However, by neglecting to cool your NVME drives, you are also increasing the risk of components other than the NAND flash failing. The controllers on NVME drives produce a lot of heat and this increases with each generation. You shouldn’t really run a 3.0 drive without a heatsink, but I would say it is pretty much a requirement for 4.0+. Plus, the heatsinks work both ways, so the heat from the controller will actually transfer to heat the cooler NAND chips whenever you start doing something.
And if that isn’t enough to convince you that the myth is bollocks, the data retention for NAND flash is much more volatile at 50C as opposed to 25C. For the best data retention you would want to keep the drive around room temperature for anything other than writing.
LTT actually had a several year old video where they bring this up (and either they or Gamers Nexus referenced the original “study”), but then last year LTT actually snuck a retraction of the statement into one of their newer videos because they couldn’t verify any of their claims with the manufacturers.
tl;dr - NVMe drives come with heatsinks for a reason, use them.
Reading doesn’t affect the state of the cells inside the memory, so the temperature doesn’t really matter when doing so, up to a certain point.
Data retention refers to the powered-off state, when the cells inside the memory are not under voltage to keep their state. It just means that if you want to keep data on an SSD for a long time you’d better put it in the freezer, else you’re gonna need to hook it up to a PC from time to time. That’s about it.
Sure, that makes sense. But the OP posted temperatures that are absolutely safe for almost any piece of silicon in consumer PCs today. My assumption was based on that for both NAND temperature and controller temperature. Also keep in mind that the temperature reported by the drive is the controller temperature and not the NAND temperature, which is most likely lower than 50°C.
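Since Unraid runs on Linux, one way to see which sensors a drive actually exposes is the kernel's hwmon interface. This is only a rough sketch: it assumes a kernel with NVMe hwmon support that exposes the drive under `/sys/class/hwmon` with the name `nvme`, and the function name `read_nvme_temps` is made up for illustration. Which physical spot each sensor label corresponds to (controller or NAND) depends on the drive and is not something this script can tell you.

```python
from pathlib import Path


def read_nvme_temps(hwmon_root="/sys/class/hwmon"):
    """Return {(hwmon_dir, sensor_label): temp_in_celsius} for NVMe hwmon devices.

    Assumes the usual hwmon layout: a 'name' file per device and
    temp*_input files holding millidegrees Celsius.
    """
    readings = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        try:
            if name_file.read_text().strip() != "nvme":
                continue
        except OSError:
            continue  # no readable 'name' file, skip this entry
        for temp_file in sorted(dev.glob("temp*_input")):
            label_file = dev / temp_file.name.replace("_input", "_label")
            try:
                label = label_file.read_text().strip()
            except OSError:
                label = temp_file.name  # fall back to the file name
            try:
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor not readable right now
            readings[(dev.name, label)] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for (dev, label), temp_c in read_nvme_temps().items():
        print(f"{dev} {label}: {temp_c:.1f} C")
```

If the drive reports more than one sensor, they show up as separate labels, which at least makes it visible that the single number in a dashboard is not the whole picture.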
I don’t know about you, but drives usually get moved from system to system because they’re always useful. So I see someone using an SSD till it’s dead. Sure they heat up, but in this case, I’m just going off of what the OP reported. Don’t know if that’s under max workload or idling. I didn’t talk about running the drives hot on purpose.
That’s not true, not all drives come with heatsinks. Most are even mounted under GPUs and surely heat up quite a bit when the GPU is working. And that’s due to motherboard layout, nothing the end user can do about it.
If they did not have a heatsink when you got them, it could be worth trying them without one to see if the heat is less long term and more momentary.
There were quite a few reports of motherboard M.2 slots with heatsinks causing drives to get too hot because the heat had nowhere to go. It went into the heatsink, which heat-soaked and then kept the drive hot rather than taking the heat away.
But like almost everyone here has stated, this is fine. I would not worry at all. Typically danger temps are up at 95-105°C, though again the controller is more sensitive, but from what you have said all looks okay.
You may not have meant it, but that’s exactly how it will be received by someone when they’re missing all of the nuance from the discussion. I’ve seen this specifically parroted on places like reddit many times. Let me put it a different way: please state the reasons why “NAND flash likes temperature” other than the already mentioned write operations, which I would also like to see substantiated with a study on modern NVMe drives (like what that wear difference actually is and if it is even a concern overall). I’m not trying to attack you or your statements, so no need to feel defensive. It’s just that this is a topic that I have seen misrepresented by a few others, so I apologize if my response was overly abrasive. I was incorrect about the data retention part though, you are right in that it is the power-off state where it is affected by temperature.
From what I have seen, most SSDs list a safe operating temperature of 0 to 70C with some NVMEs going up to 85C. You’re right in that it is the controller that is more sensitive and the main concern, so that’s probably what they use as their metric. The safe storage temperatures should also extend out a little further in either direction as well.
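To put the OP's numbers against the ranges mentioned in this thread, here is a small sketch. The thresholds are just the figures quoted here (0 to 70C operating range, danger temps starting around 95C), not values from any datasheet, and the helper names `classify_temp` and `headroom_c` are made up for illustration. Check your own drive's spec sheet before relying on them.

```python
# Thresholds quoted in this thread, NOT taken from a manufacturer datasheet.
OPERATING_MIN_C = 0.0
OPERATING_MAX_C = 70.0
DANGER_C = 95.0


def classify_temp(temp_c, operating_max_c=OPERATING_MAX_C, danger_c=DANGER_C):
    """Bucket a reported drive temperature using the thread's rough ranges."""
    if temp_c < OPERATING_MIN_C:
        return "below operating range"
    if temp_c <= operating_max_c:
        return "within operating range"
    if temp_c < danger_c:
        return "above rated range"
    return "danger zone"


def headroom_c(temp_c, operating_max_c=OPERATING_MAX_C):
    """Degrees left before the assumed operating maximum is reached."""
    return operating_max_c - temp_c


if __name__ == "__main__":
    # The two readings the OP posted.
    for drive, temp_c in [("SanDisk Ultra 3D", 50.0), ("Samsung 970 EVO", 55.0)]:
        print(f"{drive}: {classify_temp(temp_c)}, {headroom_c(temp_c):.0f} C headroom")
```

For a drive rated to 85C you would pass `operating_max_c=85.0` instead of the default.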
Not all NVMEs or Mobos may come with heatsinks, but I wouldn’t run one without it. I’m also more than slightly obsessed with airflow and thermals FWIW.
As a reference point (@17C ambient), my 970evo+ NVME uses the motherboard heatsinks on the Z490 Unify and it idles at +18C over ambient and has never exceeded +25C during heavy workloads. This is in the primary m.2 slot behind the GPU and during gaming it will also run at about +25C. I have an 860evo in the same system and it will idle at +8C and temps will only increase by a few degrees under load. I would consider this to be the most optimal airflow environment possible, so these temps are definitely going to be on the low end compared to most.
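Since I quoted everything as a delta over ambient, here is the same arithmetic spelled out for anyone who wants to compare against absolute readings. It is just addition and subtraction; the function names are made up for illustration, and the inputs are the figures from this post.

```python
def absolute_temp(ambient_c, delta_c):
    """Convert a 'degrees over ambient' figure into an absolute reading."""
    return ambient_c + delta_c


def delta_over_ambient(reading_c, ambient_c):
    """Normalize an absolute reading so different rooms can be compared."""
    return reading_c - ambient_c


if __name__ == "__main__":
    ambient = 17  # ambient temperature stated in the post, in C
    print("970 EVO Plus idle:", absolute_temp(ambient, 18), "C")
    print("970 EVO Plus load:", absolute_temp(ambient, 25), "C")
    print("860 EVO idle:", absolute_temp(ambient, 8), "C")
```

Going the other way, `delta_over_ambient` lets you turn your own absolute reading into a delta so that setups in rooms with different temperatures can be compared fairly.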
That’s absolutely reasonable. I just brushed it off like an absolute truth without going into the details of it.
Well, NAND memory has the same internal layout for the most part (unless we’re talking about Intel 3D XPoint storage, which isn’t NAND), so it doesn’t really matter how the controller interfaces the memory with the system (SATA, PCIe, etc.).
The impact of temperature on NAND memory is basically restricted to write operations, simply because a drive only does two things: read and write. Reading doesn’t impact the state of the cells, so there isn’t a temperature threshold that might impact reading, as long as the memory stays within the temperature range specified by the manufacturer.
Writing is impacted by temperature because, due to the way transistors are made, the amount of thermal energy that’s already available inside the silicon makes the transition of electrons towards holes (and vice versa) easier through the potential barrier inside the junction. Too low of a temperature and the transition is much slower: it takes more time to switch the transistor from one state to another, so voltage is applied for longer and degrades it in the process. Too high of a temperature and the flow of electrons is no longer controllable, and the junction destroys itself, becoming electrically neutral and therefore incapable of holding a defined state.
This is a condensed version for people that don’t want to go through, at least, three courses of applied electronics like I did for university hahaha
Your apology is accepted. I did my part in misunderstanding the tone of the reply and I’d like to apologize too for being so defensive. We’re good in my eyes!
Those are incredible temperatures! My Sabrent Rocket 3.0 on a SoDIMM.2 with basically a fan pointed at it idles at 28°C in a 20°C ambient temperature. I swear I’m not pointing a heat gun at it from time to time hahaha
I wish I had those temps for my NVMe SSDs. I have three, and they idle at 40°C at the very least.
My one Corsair Force MP500 never goes under 50°C but even under load that doesn’t change much, goes up to 60°C and then almost stops at that temperature. I got it like that, never changes even with good airflow over it.
My small Samsung OEM (First Gen) is the coolest, even when the OS runs on it.
Samsung SSDs seem to keep their cool better than others from what I observed, from all the manufacturers that I had, which are quite a lot now that I think about it…
This right here is the type of discussion I joined for (if this goes too far off topic hopefully a mod will tell us to get a room). When I first heard someone say “NAND flash likes to run hot”, it didn’t sound right to me as I can think of very few instances where you wouldn’t want to run your components as close to room temperature as possible. But there wasn’t much information on the first dozen pages of Google and the best I could track down was a 2015 article from a guy who tested NAND flash in 2.5" SATA drives and he was reporting up to a 50% reduction in max TBW by keeping the chips at 25C instead of 50C. A few years later Gamers Nexus referenced this first article and then a few years after that LTT did the same, but the latter ended up making a retraction in a more recent video because they couldn’t get any industry sources to verify anything either way.
I’m not trying to debate the fact that NAND flash wears more at lower temperature, but what I would really like to know is to what degree it is increased and if that would actually make a difference in the majority of use cases. Also whether or not this problem even exists in modern NVMe drives, as they are much faster and create more heat in general. Like sure the chips can sit at 25C when idle, but I imagine writing at 5gig/sec would have to heat them up to at least somewhat close to optimal temperature.
I can think of a few possible scenarios: first is that it isn’t a problem for modern flash, or at least not to any degree of concern; second is that it is a problem but the industry is keeping quiet in hopes that no one notices; and third is that there is an industry-wide conspiracy to push unnecessary and potentially detrimental NVMe cooling solutions in order to sell more NAND flash. My gut tells me that it’s either the first or second because I don’t think the industry as a whole is competent enough to pull off such a scheme, and even then I’m still leaning heavily towards the first. Or maybe we’re on the verge of a NAND flash failure epidemic, as consumer SSDs have been prevalent for more than a decade now.
They are only faster as long as there is cache available, if you’re writing to the NAND (TLC, QLC) directly then it’s slower than even most HDDs. TLC is right at the cusp of being faster than HDD when writing directly to it, QLC is slower when written to directly.
Well the TBW will go down and down and down, I mean QLC already has about half of TLC, and PLC will have about half of QLC again. So they will more likely fail faster in the future.
Thank you guys for all the feedback! I do not think I will ever reach the drive’s TBW even if I am using it as a cache to write to other drives. I am currently only using it to back up stuff and run Docker on it.
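For anyone who wants to sanity-check that for their own drive, the back-of-the-envelope math is just rated TBW divided by how much you write per day. Everything numeric below is a placeholder: the 600 TBW rating and 50 GB/day are made-up example values, not specs of the drives in this thread, and the "each step halves TBW" factor is only the rough claim from the post above. The function name `years_to_reach_tbw` is hypothetical too.

```python
def years_to_reach_tbw(rated_tbw_tb, daily_writes_gb):
    """Years until the rated TBW is used up at a constant daily write volume.

    Uses decimal units (1 TB = 1000 GB) and 365-day years.
    """
    if daily_writes_gb <= 0:
        raise ValueError("daily_writes_gb must be positive")
    days = (rated_tbw_tb * 1000.0) / daily_writes_gb
    return days / 365.0


if __name__ == "__main__":
    example_tbw_tb = 600.0   # placeholder rating, check your drive's spec sheet
    example_daily_gb = 50.0  # placeholder workload
    # Rough halving per cell type, as claimed earlier in the thread.
    for cell_type, factor in [("TLC", 1.0), ("QLC", 0.5), ("PLC", 0.25)]:
        years = years_to_reach_tbw(example_tbw_tb * factor, example_daily_gb)
        print(f"{cell_type}: about {years:.1f} years at {example_daily_gb:.0f} GB/day")
```

Plug in the TBW from your drive's spec sheet and the daily writes from your SMART data to get a figure that actually applies to you.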