Hey wendell, would you trust 8 or 10TB HDDs in an array? I think I wouldn't even trust them with two-drive redundancy.
Is there a system that has file-level redundancy? So if there is a failure during an array rebuild, you just lose a single file and not everything?
- Do you have reason not to trust them? I don't think 10TB consumer HDDs have been in operation long enough to have many results, but I don't see why there would be an issue apart from a larger data loss in the event an entire disk is lost. ZFS and Btrfs do software/filesystem RAID, which I believe generally has benefits over hardware RAID, but my memory doesn't serve me well on the exact positives and negatives.
If the disk fails completely then there's not much you can do. But with SnapRAID you'd only lose the files on the disks that have failed, and you'd be able to recover anything up until the point where it fails. And if there are unrecoverable blocks, then you'd only lose the files that relied on those blocks to recover.
I'm not sure how ZFS and Btrfs handle it, but if a disk dies beyond what parity can cover then the array is gone; if there are just unrecoverable blocks then I'm not sure what happens. With regular RAID pretty much any failure will destroy the array.
I'm using ZFS Z1 on my 4x8TB array. So far I've had no errors. I am worried about it, however, and I do plan to periodically swap out the drives to make sure I have fresh ones in the system. I'm also going to build a cold storage server eventually that I can use for backups. As for the drives, the Seagate drives are SMR as opposed to PMR.
This is an issue if you plan to do more than archive things on them, as SMR overlaps (shingles) adjacent tracks to squeeze more data onto the disk, so rewriting one track means rewriting its neighbours as well. This means your potential for errors goes up, since data within a zone must be written sequentially, and write performance drops.
Unless I'm mistaken, both ZFS and BTRFS do integrity checking of all the files/parity, so that uncaught errors on the drives don't silently develop into corruption. As for how they work, a ZFS Z1 array is functionally the same as RAID 5 (single-drive parity), Z2 is RAID 6 (dual-drive parity), and Z3 is triple-drive parity (sometimes informally called RAID 7). SnapRAID sounds more like a JBOD to me. I really haven't used JBOD enough to know much about it other than that it concatenates drives into a single volume. If I understand correctly, it could be easier to recover files off a damaged JBOD than a RAID 0 as they aren't striped, but this isn't always true. I'll have to look into SnapRAID.
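To make the "single-drive parity" idea concrete, here's a minimal Python sketch of XOR parity, the mechanism behind RAID 5 / Z1 reconstruction. The tiny byte strings standing in for drive contents are purely illustrative; real arrays do this per striped sector.

```python
# Illustrative sketch: single-parity (XOR) reconstruction as used by
# RAID 5 / ZFS Z1. Tiny byte strings stand in for whole drives.

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data_disks = [b"AAAA", b"BBBB", b"CCCC"]  # three "data drives"
parity = xor_blocks(data_disks)           # the "parity drive"

# Disk 1 dies: rebuild its contents from the survivors plus parity.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity])
print(rebuilt)  # b'BBBB'
```

Dual parity (RAID 6 / Z2) adds a second, differently computed parity block so any two simultaneous failures can still be recovered.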
What I mean is I don't know how ZFS and Btrfs deal with unrecoverable sectors. With RAID the disk will get dropped from the array, but I'd like to think that Btrfs and ZFS would just skip the sector and keep recovering what they can.
SnapRAID isn't really like RAID at all. It works on top of the file system and doesn't change anything on the disks; it just generates parity data so that if a disk fails it can be rebuilt. But the disks are all independent and still work the same way they normally would.
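For what it's worth, a SnapRAID setup is just a config file naming independent data disks and a parity location, roughly like this (the paths here are hypothetical):

```
# /etc/snapraid.conf -- hypothetical example layout
parity /mnt/parity1/snapraid.parity
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/
```

You then run `snapraid sync` periodically to update the parity and `snapraid fix` to rebuild a failed disk; between syncs the disks are just normal, independently mountable filesystems.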
Unless I'm mistaken, they will mark it as broken and then rebuild from parity. Is that what you mean? If the drive continues to fail it will keep giving you warnings, but I don't believe it will outright drop the disk. These are my scrub results.
Yeah, if there is parity available then it will fix the block, but I'm talking about a situation in which you have to rebuild the array and hit an unrecoverable sector while rebuilding. If the disk is dropped then the array is lost, but if it skips the sector and continues to recover then you only lose the data that relies on that sector. I'm not sure if this is the way ZFS and Btrfs work, but I would assume it is.
I mean if you've already lost a disk and have to rebuild; in that case there won't be any parity left. If you have two-disk parity then it's fine, I just mean a situation where you've already lost disks and one more failure will be too many. That's what the OP is worried about.
If another disk totally fails then the array is gone; I'm just not sure what happens if it's only an unrecoverable sector. It should be possible to skip it and just deal with some file corruption, but I'm pretty sure with traditional RAID the disk will be dropped, which will destroy the array. I'm pretty sure ZFS and Btrfs won't do anything that dumb.
4TB is the maximum I trust. I like the Western Digital Red and WD Re drives I am running. Because the price difference between the Blacks and the Re drives is too small to make a difference, I am thinking about throwing them into my next rig...
The main issue is simple really. When a drive dies and gets replaced, the array is repaired from parity data. This repair takes a long time and puts a lot of stress on the drives. This means that there's a larger chance that another drive starts throwing up errors because it can't handle the stress. The larger the drives get, the longer a rebuild will take and hence the bigger the chance of another drive crapping out.
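The scaling is easy to ballpark. A rough estimate, assuming (optimistically) a sustained 150 MB/s of rebuild throughput; the numbers are illustrative, not measurements:

```python
# A rebuild has to read/write essentially every sector, so rebuild time
# scales linearly with capacity. Illustrative numbers only.
capacity_tb = 10          # size of the replaced drive
sustained_mb_s = 150      # assumed average rebuild throughput

seconds = capacity_tb * 1e12 / (sustained_mb_s * 1e6)
hours = seconds / 3600
print(f"{hours:.1f} hours")  # 18.5 hours of continuous, full-speed I/O
```

In practice rebuilds compete with normal array traffic, so multi-day rebuilds on 8-10TB drives are realistic, and that whole window is when a second failure hurts.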
Always stress-test your HDDs for at least 24 hours before putting them in an array. Actually ... always stress-test your new HDDs, period.
My preferred method is to:
- Take a screenshot of the S.M.A.R.T. window (I use CrystalDiskInfo to get a S.M.A.R.T. readout).
- Then do a 35-pass full disk wipe with CCleaner.
- Follow that with a full error scan using HD Tune (don't tick the "quick scan" checkbox, you want to do it thoroughly).
- If it gets a perfect report, check the S.M.A.R.T. numbers again and compare them.
If the drive makes it through all that (with 10TB drives you'll be talking days or close to a week; IIRC my 4TB ones took 40+ hours to undergo the whole procedure), it'll last a looooong time. However, if it starts throwing errors, be glad that you caught it before putting your data on it, because it wasn't going to last long.
Yeah, so unless you have luck and triple-drive redundancy... then you're pretty much screwed above 4-5TB drives.
Yeah, this sounds like what would be important, because I don't want a single block error during an array reconstruction to end up DESTROYING my WHOLE array because there isn't enough parity to handle a second or third concurrent error.
Well, that's my issue: MATHEMATICALLY an HDD is specced to hit an unrecoverable read error about once every 10^14 bits, and even just READING a drive over 5TB end to end comes close to that number of bits. Therefore, during a full reconstruction on a RAID, that error is statistically very likely to happen (not quite a guaranteed 100%, but the odds are bad) if your drives are larger than, say, 4 or 5TB each.
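That back-of-envelope math can be checked directly. Assuming the usual consumer datasheet spec of one unrecoverable read error per 10^14 bits (real-world rates vary, so treat this as illustrative):

```python
# Probability of hitting at least one URE while reading a full drive,
# assuming the datasheet rate of 1 error per 1e14 bits read.
ure_rate_bits = 1e14
drive_tb = 10
bits_read = drive_tb * 1e12 * 8       # one full end-to-end read

p_no_error = (1 - 1 / ure_rate_bits) ** bits_read
p_error = 1 - p_no_error
print(f"P(at least one URE) = {p_error:.0%}")  # 55%
```

So by the spec a single full read of a 10TB drive has roughly a 55% chance of hitting a URE: likely, but not a guaranteed 100%. Across several surviving drives in a degraded rebuild, though, the combined odds get very ugly.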
In ZFS and BTRFS, because of the calculations that are done to create a sort of software ECC for the hard drives, those read errors you're concerned about should theoretically be corrected as they occur, using the data off the other disks in the array. That is a massive, glaring issue with traditional RAID, but ZFS and BTRFS aren't vulnerable to the same kind of failure.
If it's a double-parity situation then no, there should be enough data on the other disks to correct it without issue. At least that is my understanding; it would be great to have this confirmed by @wendell.