Hello all. I have a home server with a 2x1TB NVMe ZFS mirror and a 2x8TB HDD ZFS mirror: the NVMes for VMs and the HDDs for data. I want to swap the HDDs for a 3x4TB RAIDZ1 SATA SSD setup. Is there any downside to this? Is the data secure long term on SSDs?
You can buy … well, I can buy (not sure about the situation in other countries) SATA SSDs with 5-year warranties, so I would trust that I can replace the Inexpensive Disks in my Redundant Array of Inexpensive Disks. It will mean swapping a drive on failure and chasing a supplier RMA, but I think it's doable.
Solid-state devices use less power, run cooler, and the configured redundancy should keep your data safe. Check your daily written volume and examine the endurance specs on the drives you want to buy so that you won't exceed the flash cells' lifetime write limits; even reading can disturb some multi-level NAND cells.
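If you want to put a number on your daily written volume, here is a rough sketch that pulls it from SMART via smartmontools; it assumes the drive exposes attribute 241 (Total_LBAs_Written), as most Samsung SATA SSDs do, and the device path and 512-byte logical sector size are placeholders to adjust:

```python
#!/usr/bin/env python3
"""Rough daily-write estimate from SMART counters. Assumes smartmontools is
installed and the drive reports attribute 241 Total_LBAs_Written (most Samsung
SATA SSDs do). Device path and 512-byte logical sectors are placeholders."""
import re
import subprocess

DEVICE = "/dev/sda"      # placeholder: adjust to your drive
SECTOR_BYTES = 512       # logical sector size assumed for the LBA counter
POWER_ON_HOURS = 9       # SMART attribute IDs
LBAS_WRITTEN = 241

out = subprocess.run(["smartctl", "-A", DEVICE],
                     capture_output=True, text=True).stdout

def raw_value(attr_id: int) -> int:
    """Return the RAW_VALUE column for a SMART attribute ID."""
    for line in out.splitlines():
        fields = line.split()
        if fields and fields[0] == str(attr_id):
            return int(re.match(r"\d+", fields[-1]).group())
    raise SystemExit(f"attribute {attr_id} not reported by {DEVICE}")

written_tb = raw_value(LBAS_WRITTEN) * SECTOR_BYTES / 1e12
days = max(raw_value(POWER_ON_HOURS) / 24, 1)
print(f"{written_tb:.2f} TB written over ~{days:.0f} powered-on days, "
      f"about {written_tb / days * 1000:.1f} GB/day")
```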
K3n.
Is the data secure if it is written once and not read/rewritten for a long time?
SSDs need power to keep the data on the drive. As long as they are plugged into power once a month or so, they should be fine.
Scheduled scrubs would take care of the "read everything" part, no?
So far, untested.
Refreshing cells requires not just a read/patrol operation (as done in scrubs) but also a write operation, which a scrub only performs if the data fails its checksum and a good copy is available elsewhere in the array.
Ideally the data is read from disk into memory, then written back, then read again with checksumming at each step to maintain integrity.
You'll add one full disk write every cycle, but it would keep the data fresh, and with modern SSD write endurance that should not be a problem unless run daily.
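If you want to script that cycle yourself, here is a minimal sketch of the read, verify, rewrite, re-verify idea for a single file; the path is hypothetical, a real job would walk the whole dataset, and on ZFS the rewrite lands in new blocks anyway thanks to copy-on-write, which is what moves the data into fresh cells:

```python
#!/usr/bin/env python3
"""Minimal read -> verify -> rewrite -> re-verify sketch for a single file.
The path is hypothetical; a real refresh job would walk the whole dataset, and
snapshots would keep the old blocks pinned on ZFS."""
import hashlib
import os
import shutil

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    """Checksum the file in 1 MiB chunks (the read pass pulls data off flash)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def refresh(path: str) -> None:
    before = sha256_of(path)            # read + checksum the current data
    tmp = path + ".refresh"
    shutil.copy2(path, tmp)             # rewrite: on ZFS this lands in new (fresh) blocks
    if sha256_of(tmp) != before:        # re-read and verify the rewritten copy
        os.unlink(tmp)
        raise IOError(f"checksum mismatch while refreshing {path}")
    os.replace(tmp, path)               # atomically swap in the fresh copy

if __name__ == "__main__":
    refresh("/tank/archive/example.img")  # hypothetical file on the pool
```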
I've been using 6x4TB SATA SSDs in a RAIDZ2 pool and I've not had any issues so far. The pool is getting scrubbed regularly and I haven't lost any data. The system is on 24/7, so the drives also have their own logic to refresh the cells. So no, I don't think there are any downsides.
I'd be concerned if you told me that you need to re-write the whole pool over and over as often as possible.
The only downside is monetary, in my opinion. SATA SSDs in big capacities and from reliable brands have gone up in price since last year, so it's gonna be a bit more expensive to pull off.
I echo the same sentiment.
There is still some research being done on cold storage and nearline storage reliability for SSDs, but for online storage, the only downside is that it costs a bit more. At the same time, it's not unreasonably expensive with a 3x4TB setup these days (~$650-$700 with modern pricing).
Just make sure you go with TLC, and beware the limits of the SATA interface. If you can, try to migrate to NVMe storage.
The system is on 24/7 now, but with the migration to SSDs I may start keeping it on only half the day and shutting it down at night. The SSDs are 870 EVOs, so TLC. I will make the change and see how much I can lower the power consumption. Thanks!
Short shutdowns are fine; it's when an SSD sits unused for months or years that it might become an issue. And the jury is still out on that one.
Also, if you want to have something only running when required, look into IPMI. If your server does not support that, you can jank-DIY it with an SBC like a Raspberry Pi.
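As one example of the jank-DIY route, an always-on Pi can wake the server on demand with a standard Wake-on-LAN magic packet; the MAC address below is a placeholder, and WoL has to be enabled in the server's BIOS/NIC first:

```python
#!/usr/bin/env python3
"""Send a Wake-on-LAN magic packet from an always-on SBC such as a Raspberry Pi.
The MAC address is a placeholder; WoL must be enabled in the server's BIOS/NIC."""
import socket

SERVER_MAC = "AA:BB:CC:DD:EE:FF"        # placeholder: the server NIC's MAC address
BROADCAST = ("255.255.255.255", 9)      # standard WoL broadcast address and port

def wake(mac: str) -> None:
    payload = bytes.fromhex(mac.replace(":", ""))
    magic = b"\xff" * 6 + payload * 16  # 6 bytes of 0xFF followed by the MAC 16 times
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic, BROADCAST)

if __name__ == "__main__":
    wake(SERVER_MAC)
```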
I'm fluffing around and finding out for myself.
One curious side effect is how ZFS favors faster vdevs.
This may not be what some would want for SSDs, but even with "identical" drives on the same HBA it's not an even distribution of data: one vdev fills before the other here. This is repeatable for me; I synced this data twice just to make sure.
It shouldn't actually be a problem, but it certainly triggers my tisms.
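If anyone wants to check the skew on their own pool, the per-vdev allocation can be pulled out of `zpool list -v`; something like this quick sketch, where the pool name is a placeholder:

```python
#!/usr/bin/env python3
"""Print per-vdev fill percentage by parsing `zpool list -v -H -p`.
The pool name is a placeholder; leaf disks report '-' for SIZE and are skipped."""
import subprocess

POOL = "tank"  # placeholder pool name

out = subprocess.run(["zpool", "list", "-v", "-H", "-p", POOL],
                     capture_output=True, text=True).stdout
for line in out.splitlines()[1:]:          # first row is the pool-level summary
    fields = line.split("\t")
    if len(fields) < 3:
        continue
    name, size, alloc = fields[0].strip(), fields[1], fields[2]
    if size.isdigit() and int(size) > 0:   # keep mirror-*/raidz* rows, skip leaf disks
        print(f"{name:>12}: {100 * int(alloc) / int(size):.1f}% allocated")
```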
I should have elaborated: that was in reference to maintaining 100% performance on flash when data sits unused for extended periods (months to years) while the drive is running frequently to constantly.
The only way to regain the lost performance from the data stagnating is to refresh the cells with a read & write operation.
I run flash-based servers personally and do not worry, as my systems are so much faster than HDDs; my limitation is network bandwidth.
100 gigabit gets saturated well enough, and 200 gig is both too expensive and generates too much heat for me to bother with at this point.
I'd say that the other concern is the wearing out of SSDs, but you might book that under "costs a bit more" as well.
I'm new to all this, but from what I have read, the write amplification that file systems like ZFS introduce is a concern. Things I would consider:
- Using an SLOG on a separate high-endurance (pair of) SSD, if you have many synchronous writes. That'll save plenty of writes from the main storage pool.
- Using an "appropriate" record size. Not sure what a good one for SSDs would be, but I suppose it would be much smaller than 128k. My understanding is that on an SSD data is written in 4k blocks, so it might be as low as that, if you want to minimize writes. Or maybe something like 16k? (Some rough numbers on this below.)
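As a rough illustration of why I think record size matters: for small random overwrites of large files (VM images, databases), ZFS rewrites a whole record per update, so the amplification scales with the record size. These are illustrative numbers, not measurements; large sequential files aren't really affected by a big record size.

```python
#!/usr/bin/env python3
"""Back-of-envelope write amplification from recordsize for small random
overwrites (e.g. VM images or databases). Illustrative numbers, not measurements;
large sequential files are not really affected by a big recordsize."""

UPDATE_KIB = 4                           # size of each random write the workload issues
for recordsize_kib in (128, 64, 16, 4):
    # ZFS rewrites whole records, so a sub-record update dirties an entire record
    amplification = recordsize_kib / UPDATE_KIB
    print(f"recordsize={recordsize_kib:>3}k -> ~{amplification:.0f}x write amplification "
          f"for {UPDATE_KIB}k random updates")
```

And if I understand correctly, changing it (`zfs set recordsize=16K` on the dataset) only applies to newly written blocks, not existing data.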
Does that make sense or are those concerns unfounded?
While it is true that some SSDs are worse when it comes to wear… with TLC drives, unless you are doing full drive writes all day every day for like 2-3 years, it is not an issue. For QLC that window shrinks to 18 months, but again, this assumes you write to the whole drive at least 20-40 times a day.
Most storage setups don't use their SSDs like that.
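To put rough numbers on that, using the 2400 TBW rating commonly quoted for a 4 TB consumer TLC drive like the 870 EVO (the daily write volumes are just assumptions):

```python
#!/usr/bin/env python3
"""Rough endurance arithmetic. 2400 TBW is the rating commonly quoted for a 4 TB
consumer TLC SATA drive such as the 870 EVO; the daily write volumes are assumptions."""

CAPACITY_TB = 4
TBW_RATING = 2400                        # rated terabytes written over the drive's life

for label, tb_per_day in [("1 full drive write per day", CAPACITY_TB),
                          ("typical homelab, ~50 GB/day", 0.05)]:
    years = TBW_RATING / tb_per_day / 365
    print(f"{label:<30}: rated endurance lasts ~{years:.1f} years")
```

So unless the pool sees sustained, drive-filling writes, endurance is unlikely to be the limiting factor.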
Really? I thought that powered drives would refresh the voltage level of each cell if it dropped below a certain value.
I know it's not really possible to know if this is true, but they should do that.
My experience has been positive so far; I've never seen a dip in performance after multiple years with the same data on a drive.
I would test these theories, but I ain't got time nor memory (the irony) to remember.
My only guess as to why virtually no consumer NVMe drive does this is that the manufacturers are worried about burning up P/E cycles on the drive for perceived little gain.
It's fairly easy to observe the aged-data read speed degradation that almost all consumer NVMe drives experience: run a surface test of the drive and there will be sectors that are significantly slower than others (by 1-2 orders of magnitude). The slowdown seems to become noticeable 6-18 months into the data's age on most drives, even sooner if the drive has wear on it.
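A crude way to see it for yourself is to time sequential reads across the raw device and look for regions that are an order of magnitude slower than the rest; something like this sketch, where the device path is a placeholder and a first pass on an otherwise idle drive gives the cleanest picture:

```python
#!/usr/bin/env python3
"""Crude read-speed surface scan: time large sequential reads across a raw device
so regions holding old, stale data show up as much slower. The device path is a
placeholder; run as root against an otherwise idle drive (first pass avoids cache)."""
import os
import time

DEVICE = "/dev/nvme0n1"        # placeholder: the drive to scan (read-only)
CHUNK = 64 * 1024 * 1024       # 64 MiB per timed read
SAMPLE_EVERY = 16              # read every 16th chunk to keep the scan quick

fd = os.open(DEVICE, os.O_RDONLY)
try:
    size = os.lseek(fd, 0, os.SEEK_END)
    for i in range(0, size // CHUNK, SAMPLE_EVERY):
        os.lseek(fd, i * CHUNK, os.SEEK_SET)
        start = time.monotonic()
        data = os.read(fd, CHUNK)
        elapsed = time.monotonic() - start
        if data and elapsed > 0:
            print(f"offset {i * CHUNK / 1e9:8.1f} GB: {len(data) / 1e6 / elapsed:7.1f} MB/s")
finally:
    os.close(fd)
```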
Sooo… a single refresh every 6 months or so should do the trick, is what you are suggesting? Just like defrag once did for old FAT drives?
Probably even less often would be fine for most drives, perhaps more like once a year.
What I wish we could do as a community is identify all the drives we know benefit from the process, so no one needlessly does it on a drive that already has a charge refresh algorithm implemented (like most of the MX500s).
I had (wrongly, perhaps?) assumed that all SSDs had some charge refresh algorithm implemented. Could you point me to further info on this, please? I wasn't able to find anything on it myself.
…well, technically there are some charge refresh algorithms that are almost universally implemented on SSDs, like read-disturb refresh (actually, some low-cost SD cards don't even have that mechanism), but the one we're interested in is a temporal charge refresh algorithm that automatically refreshes cells whose data has gotten old.
There's a whole thread on the phenomenon: