Hardware vs. Software RAID for Data Integrity

I want to set up a basic RAID array on my home server. It is running Windows Server 2012 (I'm not willing to change the OS) on an HP Z400 with 10 GB of RAM and a 4C/8T 2.66 GHz Xeon X5550. I will be storing around 4 TB of data (mirror size).

The data that will be kept on this server is quite important, so data integrity is a priority.

What should I do? Should I use the HP Z400's built-in RAID functionality (no RAID battery backup), or should I use some sort of software RAID inside Windows Server 2012?

Thanks :slight_smile:

Neither will do anything for data integrity, only redundancy.

Given the options I'd say they're both about the same, but software RAID would be more portable, in the sense that you could replace the hardware without killing your array.

Is the data static or will it change frequently?


I plan on doing only RAID 1. I thought that if I removed a drive, I could just read it normally?

Not sure what this means.

I remember hearing in the past that some forms of RAID are more prone to errors and are harder to recover data from. Is this true?

If the system is Windows, use hardware RAID.

If it is Linux, install ZFS and use some kind of software RAID 1. I'm assuming you have more than one drive. Are they similar sizes?

Alternatively, install BTRFS and use software RAID 1 with regular data scrubbing.
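For reference, setting either of those up is only a couple of commands on Linux. This is a rough sketch; the pool name, mount point, and device paths are placeholders.

```
# ZFS: two-disk mirror (the RAID 1 equivalent), plus a periodic scrub
# to verify checksums and repair bad blocks from the good copy.
zpool create tank mirror /dev/disk/by-id/ata-DRIVE_A /dev/disk/by-id/ata-DRIVE_B
zpool scrub tank
zpool status tank

# BTRFS: RAID 1 for both data and metadata across two drives, plus a scrub.
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/data
btrfs scrub start /mnt/data
btrfs scrub status /mnt/data
```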

You should be able to pull a disk out of a RAID 1 array and read it normally, but different implementations may behave differently.

By static data I mean data that you put on the disk and leave there, rather than data that you modify frequently. I ask because if it's mostly static you could use SnapRAID to get redundancy and integrity checking without having to worry about RAID.

No RAID is immune to data integrity errors, but RAID 5 can become completely unrecoverable if the right kind of errors occur. RAID 1 won't protect you from data integrity problems either, as it has no way of knowing which disk (if any) contains the correct data. Basically, if data integrity is important you need to use ZFS or BTRFS, but if you absolutely have to use Windows then see if SnapRAID will do what you need. I use it and think it's great, but it doesn't suit every use case.
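To give an idea of what SnapRAID looks like in practice, here is a minimal sketch; the drive letters, paths, and disk names are only examples.

```
# Example snapraid.conf on Windows (one parity drive, two data drives):
#   parity  E:\snapraid.parity
#   content C:\snapraid\snapraid.content
#   content D:\snapraid.content
#   data d1 D:\
#   data d2 F:\

snapraid sync     # compute/update parity for the data currently on the disks
snapraid scrub    # periodically verify the data against its checksums and parity
snapraid fix      # recover files after a disk failure or silent corruption
```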

If you are using Windows Server 2012, then the simplest solution to meet your requirement is to create a mirrored Storage Space with a ReFS virtual disk.

This is all managed by Windows and IMO a much better solution than the inbuilt RAID on a lower-end workstation like the Z400.

ReFS offers some of the features of ZFS (it is nowhere near as mature), and by using a 2-way mirror (2 disks needed) or 3-way mirror (5 disks needed) you protect your data against disk failure and bit rot.

One point to bear in mind is that the C: (boot) drive still needs to be regular NTFS. My son's system uses 1 x 120 GB SSD as C: and 2 x 2 TB HDDs in a 2-way mirror with ReFS on Windows 10 for all user data, games, etc.

The great thing about storage spaces is you can pull the disks from one machine and put them into another Windows box and it will recognise the pool and let you mount it. You can also create the pool with a much bigger size than the physical disks and just add more physical disks to the pool as it fills up. You can also retire disks from the pool and replace them as they age etc.
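If you prefer to script it rather than use Server Manager, creating such a pool looks roughly like this in PowerShell; the pool, disk, and volume names are just examples.

```
# Pool all disks that are eligible for pooling.
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "DataPool" -StorageSubSystemFriendlyName "Storage Spaces*" -PhysicalDisks $disks

# Create a 2-way mirrored virtual disk using the whole pool.
New-VirtualDisk -StoragePoolFriendlyName "DataPool" -FriendlyName "Mirror1" `
    -ResiliencySettingName Mirror -NumberOfDataCopies 2 -UseMaximumSize

# Initialise it, create a partition, and format it with ReFS.
Get-VirtualDisk -FriendlyName "Mirror1" | Get-Disk |
    Initialize-Disk -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem ReFS -NewFileSystemLabel "Data"
```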

Like I said, it's not as fully featured as ZFS, but if you need to use Windows it works well. Server 2016 has added some more features, but here is a basic guide for Server 2012:


Different drives, same sizes.

Just do some simple RAID 1 at the hardware level then. Don't worry about the battery-backed RAID card; they are a bit overhyped.


I will definitely be looking into ReFS and BTRFS. Thanks :slight_smile:

Regarding software vs. hardware, which should I go with? Or is there no single answer for my case, since they both have advantages and disadvantages?

Software RAID typically offers the ability to rebuild without a reboot; BTRFS definitely lets you do this. It can also scrub and repair partially corrupted files. Software RAID is also more likely to work with slightly mismatched hard drives; some RAID controllers don't like mixing hard drive manufacturers.
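For example, replacing a failing drive in a mounted BTRFS RAID 1 filesystem is done online; the device paths and mount point below are placeholders.

```
# Swap a failing drive (/dev/sdb) for a new one (/dev/sdd) without unmounting or rebooting.
btrfs replace start /dev/sdb /dev/sdd /mnt/data
btrfs replace status /mnt/data

# Per-device error counters, handy for spotting a drive that is starting to fail.
btrfs device stats /mnt/data
```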


Really no excuse not to use RAID 5 these days.

The number of drives in a system is a reason not to use RAID 5 (it needs at least three).

Plus, RAID 5 is pretty useless in terms of keeping a stable array. When the time comes for a drive in the array to fail, it will fail, and then its neighbour will likely fail during the rebuild, and then its neighbour. This is much more likely if your drives are from the same manufacturer and the same production run. Of course, having multiple drives from the same production run is much more common in enterprise environments. However, the enterprise works around this by accident, simply by having a shelf life for its drives. The enterprise only uses RAID to filter out the unexpected failures.

In the ideal home server you would use ZFS and RAID-Z, because then you can mix any drives you want. That makes it more likely that drive failures happen at staggered intervals, which helps prevent a cascading failure during a rebuild.
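As a rough sketch (the pool name and device paths are placeholders), a single-parity RAID-Z pool built from whatever mixed drives you have on hand looks like this; RAID-Z2 adds a second parity drive so the pool survives a second failure during the long resilver window.

```
# RAID-Z1 across four mixed drives; "tank" and the device paths are placeholders.
zpool create tank raidz /dev/disk/by-id/ata-DRIVE_A /dev/disk/by-id/ata-DRIVE_B \
                        /dev/disk/by-id/ata-DRIVE_C /dev/disk/by-id/ata-DRIVE_D

# Double parity, tolerates two simultaneous drive failures (needs one more drive):
# zpool create tank raidz2 <drive1> <drive2> <drive3> <drive4> <drive5>
```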

If your expected I/O rate is low, then software RAID is safer than a budget hardware RAID controller. You get to control everything with software RAID.

If your expected I/O rate is high, then hardware RAID can offer better performance as the RAID is offloaded to dedicated hardware.

I've seen budget hardware RAID cards initialize the wrong drive on replacement, lose their entire LUN config (which cannot be restored without zeroing all the disks), and leave the disks non-portable, in the sense that even if you use RAID 1, the drives might not be readable on another (non-RAID) controller.

RAID is only part of the story. You also need to consider how you back up the data, because if you accidentally delete or overwrite something (or, for example, get a dose of crypto-malware), the data will still be gone and the only recourse will be to restore a backup copy.


So it seems like the general consensus is to use software RAID in my situation, because it will be a safer bet with less chance of failure (my system also uses ECC RAM with a Xeon, if that matters).

I don't care much about I/O rate and care more about transfer speed. Most of the data stored on my server will be for long-term storage (hence my concern about data integrity), and if I ever do access my server, it will be because I am transferring large files.

I have 5.25" to 3.5" hot-swappable bays. What I intend to do is put a large drive into it, and in case of emergency (fire, water, etc), that single drive will contain all of the data, and can be removed easily, instead of taking the whole system. It will also be used as another form of backup, that will not be attached to the raid. My plan is to incrementally backup the data from the raid to this drive once a week, so if anything gets deleted, this drive will still have it. When the backup is not occurring, the drive will be off.