Consumer SSD reliability

Is there any good recent data on consumer SSD reliability? Historically I’ve just gone with Samsung Pro SATA drives, but the recent 990 Pro debacle has me looking to re-evaluate my standard approach to purchasing.

Wendell mentioned that the Solidigm drives were especially good for Linux machines, though he didn’t say why. Is there any known data about the reliability of those drives?

And there was the 980 Pro debacle, too. Samsung has a good track record all the way back to the 840.

The question is what you mean by reliability. Good firmware? TBW/DWPD? P/E cycles? Failure rate of the entire disk? Sustained write performance? Long-term performance?


I honestly think it’s a bit of a lottery at this point. Recently I had many client NVMe drives simply die within a few hours of use, from many brands; I even had a Netac of my own die within the first hour of use.
That said, I think Crucial is one of the better ones. On the lower end I’ve never had issues with Hikvision either; all of the ones I’ve had for myself and installed for clients have been rock solid, and even my DRAM-less Gen4 drive from Hikvision hasn’t skipped a beat.


Unfortunately, I would agree with a lot of what was said just above. Depending on how technical you want to get, the higher-abstraction-level approach is: look at what warranty they offer at sale, and weigh whether the manufacturer is a big enough operation to likely still be around toward the end of it. That is, if you care about the data and it’s not just a $15 upgrade for some business desktop.

You do have to consider drive wear in relation to the application. I weigh this resource highly for specifications.

Unfortunately, even the big ones (Micron/Crucial) do internal parts swaps on their consumer (i.e. uninformed-purchaser) lines.

There are definitely days when it’s easier to pick a box of chocolates.

I don’t know. The whole video was about teaching the audience that they need custom drivers on Windows for Solidigm drives.
Then it was mentioned that the drives are good on Linux. Call me skeptical on that comment.

Probably has more to do with the community’s large Linux clout, as a sales pitch. Benchmarks on Phoronix don’t seem to indicate any noticeable difference from other drives. And the GUI tool isn’t available on Linux, so Linux users remain second-class customers for Solidigm.

I think the MFU-like feature is nice as an innovation, but I don’t like vendor-specific features that require additional software that may or may not be supported in the future.


If you’re concerned about the longevity of your SSD, the TBW/MTBF ratings and the warranty window can both serve as an investment cushion. There’s always the possibility of going well past those limits, or of the device bricking on arrival.

My main exposure has been TeamGroup [PCIe 3.0], with smaller drives [250 GB and under]. Never had an issue, but they certainly handle a lot of rewrites [OS boot duty].

My main concern is inability to retrieve data from the drive. I’m okay if the drive loses performance over time or even ceases to support writing altogether as long as I’m able to extract my already-written data from the drive for a while.
I run my drives in a RAID-1 configuration. Long-term write performance bothers me less as I will likely rotate in newer, higher-capacity drives before the wear cycles are exceeded. But I really don’t want to lose my data. (Yes, I have off-site backups, but that’s a really big pain).
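
A rough sketch of watching for that rotation point, assuming smartmontools is installed and the drive sits at /dev/nvme0 (an example path; the thresholds are arbitrary too): the NVMe health log exposes the drive’s own wear estimate, remaining spare blocks, and media-error count, which is what you’d want to keep an eye on if the goal is pulling data off while it’s still readable.

```python
#!/usr/bin/env python3
"""Rough sketch: read NVMe wear/health counters via smartctl's JSON output.

Assumes smartmontools is installed and the drive is /dev/nvme0 (example
path); the warning thresholds below are arbitrary examples, not vendor
recommendations.
"""
import json
import subprocess

DEVICE = "/dev/nvme0"  # adjust to your drive

# smartctl can exit non-zero when it flags problems, so don't use check=True
raw = subprocess.run(
    ["smartctl", "--json", "-a", DEVICE],
    capture_output=True, text=True,
).stdout
health = json.loads(raw)["nvme_smart_health_information_log"]

pct_used = health["percentage_used"]      # drive's own wear estimate (can pass 100)
spare = health["available_spare"]         # % of spare blocks still available
media_errors = health["media_errors"]     # uncorrectable media errors so far

print(f"wear used: {pct_used}%  spare: {spare}%  media errors: {media_errors}")
if pct_used >= 80 or spare <= 20 or media_errors > 0:
    print("time to rotate this drive out and double-check the backups")
```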

Well, if they die, they die. You really can’t retrieve much afterwards unless you have specialized tools and act fast. Just get one with a good warranty and a high TBW rating. Backups can be a pain, but they really are necessary.


RAID-1 is more protection from random failure or weirdness than any manufacturer can ever promise. I personally have never had an SSD failure (I still have old 120GB Crucial drives with troublesome SMART values running, in RAID 1), but going for two cheaper drives instead of one very expensive one is the approach I usually take. You still get 2x read speed from mirroring, after all.
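
As a small illustration of that setup (Linux software RAID via mdadm is assumed here, which the post doesn’t actually specify): a degraded mirror shows up in /proc/mdstat as an underscore in the [UU] status field, so a quick sanity check can be as simple as this sketch.

```python
#!/usr/bin/env python3
"""Tiny illustration: warn if a Linux md (mdadm) mirror is running degraded.

Assumes Linux software RAID; /proc/mdstat shows each array's member status
as e.g. [UU] for a healthy two-way mirror, with "_" marking a missing drive.
"""
import re

with open("/proc/mdstat") as f:
    mdstat = f.read()

degraded = []
# Match "mdX : ..." up to the nearest [UU]/[U_]-style status field
for name, status in re.findall(r"(md\d+) : .*?\[([U_]+)\]", mdstat, re.S):
    if "_" in status:
        degraded.append(name)
        print(f"{name}: DEGRADED ({status}) - a mirror member has dropped out")
    else:
        print(f"{name}: healthy ({status})")

if not degraded:
    print("all md arrays report every member present")
```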

Single reliable drives… I’d look at enterprise drives. Business customers don’t accept bullshit; consumers often do, and products are made and sold with this in mind. There is no QLC, low TBW, or “tweaked” write IOPS in enterprise-land.

I bought some 2TB Kingston FURY Renegade drives a couple of months ago. They seem amazing so far, probably the best NVMe drives I’ve ever used. But if one dies, I don’t care, because redundancy, backup, and warranty. If the wrong drive dies and I have to remove the GPU to get to the M.2 slot, I’ll be furious, but that’s about it.

QFT. Having an HDD to back up all your SSDs isn’t overly expensive. Backup was more difficult and more expensive in the past than it is today.

Relevant thread to check out:

I can’t think of any legitimate reason a flash drive can’t indefinitely enter read-only mode on NAND failure. It’s a designed suicide of the controller on the first sign of NAND failure afaik.
Maybe I’m missing something, and a failed write leads to a short to ground that prevents the flash from being safely read, but… given that some drives give you a warning when encountering a write failure, and let you have until the next boot cycle to back up what you can, I find it doubtful there’s any real excuse for it. It’s just more malice towards technology consumers.
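
For reference, the NVMe SMART/health log does define exactly that state: the critical-warning byte has a bit meaning “media has been placed in read-only mode”. A small sketch that decodes it, assuming smartmontools is installed and using /dev/nvme0 as an example path:

```python
#!/usr/bin/env python3
"""Sketch: decode the NVMe critical_warning byte from smartctl's JSON output.

Bit meanings follow the NVMe SMART/Health log definition; bit 3 is the
"media has been placed in read-only mode" flag discussed above. Assumes
smartmontools is installed; /dev/nvme0 is just an example path.
"""
import json
import subprocess

DEVICE = "/dev/nvme0"

raw = subprocess.run(["smartctl", "--json", "-a", DEVICE],
                     capture_output=True, text=True).stdout
warning = json.loads(raw)["nvme_smart_health_information_log"]["critical_warning"]

FLAGS = {
    0: "available spare has fallen below threshold",
    1: "temperature outside an operating threshold",
    2: "NVM subsystem reliability degraded (media or internal errors)",
    3: "media has been placed in READ-ONLY mode",
    4: "volatile memory backup device has failed",
}

if warning == 0:
    print("critical_warning = 0: the drive reports no problems")
else:
    for bit, meaning in FLAGS.items():
        if warning & (1 << bit):
            print(f"bit {bit} set: {meaning}")
```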

Paging IOT’s

Honestly, I understand the “innovation” argument when it comes to proprietary software, but lots of companies treat their features like novel molecules developed for a PhD thesis: one and done, while pitching it as a “must have”.


There aren’t any real hard databases of consumer SSD reliability, as many people stick with high-endurance (TBW) drives if they plan to hammer the drive for work (content creation) or gaming. Generally the big five SSD makers are very equal; reliability really comes down to the controller being used. In my experience there isn’t much of a difference between Phison and Marvell; compatibility and performance come down to the SSD maker’s firmware. Most consumer SSDs in the 240-500GB range tend to have a 3-year (or 300 TBW) warranty; however, TBW doesn’t matter much, as some brands just bin slower flash chips and the actual life could be longer.
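
To put a figure like that 300 TBW warranty in perspective, here’s a rough sketch that converts the NVMe data_units_written counter (units of 512,000 bytes per the spec) into terabytes written; the device path and the rated-TBW value are just example assumptions, so substitute your own drive’s spec-sheet number.

```python
#!/usr/bin/env python3
"""Rough sketch: compare host writes so far against a drive's rated TBW.

Per the NVMe spec, data_units_written counts units of 512,000 bytes
(thousands of 512-byte blocks). The device path and the 300 TBW rating
below are example values; substitute your drive's spec-sheet figure.
"""
import json
import subprocess

DEVICE = "/dev/nvme0"
RATED_TBW = 300  # example: a common warranty figure for a ~500GB consumer drive

raw = subprocess.run(["smartctl", "--json", "-a", DEVICE],
                     capture_output=True, text=True).stdout
units = json.loads(raw)["nvme_smart_health_information_log"]["data_units_written"]

tb_written = units * 512_000 / 1e12  # decimal terabytes written by the host
print(f"{tb_written:.1f} TB written of a {RATED_TBW} TBW rating "
      f"({100 * tb_written / RATED_TBW:.1f}% of rated endurance)")
```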

As far as using consumer SSDs in an enterprise environment goes, far too many consumer models are designed around “end user” tasks, which is why enterprise drives come in different versions such as “mixed/hybrid usage”, performance-optimized, or read-optimized. For example, where databases are concerned, read-optimized is more ideal, but mixed/hybrid SSDs tend to be a better choice if the database is heavily shared or written to.


I have only ever had one SSD die on me: it was an old, cheap SanDisk 120GB drive I threw into my Mac Mini as an upgrade, and then let the Mac run 24/7 for some years as a file host.

Since then I am not sure I have even owned an SSD longer than its stated lifespan warranty.

If lifespan is a concern, then yeah, you can just grab some decent-condition enterprise drives from eBay and use a U.2 adapter in your PC.

Someone will have to remind me of the details of the “Samsung 990 Pro debacle”; I guess I missed that one. But I also haven’t bothered to upgrade to PCIe 4 or 5; all the consumer drives I end up buying are still PCIe 3, for what it’s worth.

Basically, Samsung drives were committing suicide at a pretty alarming rate. It’s apparently quite permanent damage; there was a firmware update to make it stop getting worse (as quickly).
It sounds to me like the controller was chewing up the NAND in an unreasonable way, and the firmware update fixed it so that it doesn’t chew the NAND as hard, but given the sparse details, who even knows.

The paranoid spike proteins clogging up my brain are whispering that Samsung made firmware that was designed to fail the drive just outside of warranty, but messed up a tracking value, making it go off the rails early. That, or it was their NAND endurance-testing firmware branch shipping on retail models. Pure paranoia, though.
