I have a problem with my Samsung 980 Pro NVMe system drive on my Win10 Pro workstation. I want to clone it so I have a backup, but no matter what I try, I can’t get it to work. It always ends in a BSOD or, depending on the software, an error message.
This is my setup:
I have an ASUS WRX80 SAGE Wifi motherboard, and NVMe RAID is activated in the BIOS because I need it to run my 4 x 1TB NVMe drives in RAID 0 with the included ASUS Hyper M.2 RAID card. The system drive is Array 2 (JBOD) and the 4 x 1TB drives are Array 1 (RAID 0).
Honestly, I didn’t notice that the system drive had been assigned to an array. It doesn’t make much sense imho, but that’s what it is right now.
I tried Macrium Reflect (Free version) and I’ve had success in the past with this software. But after about 60% progress, the system either crashes or ends up with a BSOD.
I then tried the Windows Clone function but this ended up in a BSOD as well.
Then I tried Clonezilla, but it didn’t recognize the system drive.
When I run CHKDSK on the system drive, it ends up with a BSOD as well.
The BSOD error message is:
Stop code: IRQL_NOT_LESS_OR_EQUAL
What failed: rcraid.sys
All drivers are updated to the newest versions.
Macrium Reflect reports the following error:
Backup aborted! - Unable to read from disk - Error Code 23 - Data error (cyclic redundancy check).
As far as I understand, this indicates some bad sectors on the nvme.
Now I’m really stuck and I don’t know what else I can try.
I would boot a Linux live USB and use dd or a graphical cloning tool for this. Just make sure you have enough space for that drive!
The easiest way is to boot Ubuntu and use the built-in Disks tool (type "disks" in the dash search). Then go to the hamburger menu → Create Disk Image. That lets you clone the disk content!
Then just apply the disk image later on with your favorite cloning tool; it should work in both Windows and Linux.
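Since Macrium reported a CRC read error, a plain copy may abort at the first unreadable block. Below is a minimal Python sketch of a raw copy that zero-fills unreadable blocks and keeps going, assuming a Linux live environment where the drive shows up as a plain block device. The function name and the paths in the comment are hypothetical, and this is only an illustration of the idea, not a replacement for a dedicated tool.

```python
import os


def copy_skipping_bad_blocks(src_path, dst_path, block_size=1024 * 1024):
    """Copy src to dst block by block, zero-filling blocks that fail to read.

    Returns the list of byte offsets at which a read error occurred.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or,
        # on Linux, of a block device.
        total = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < total:
                want = min(block_size, total - offset)
                try:
                    data = os.pread(src, want, offset)
                except OSError:
                    # Unreadable block: remember it and write zeros instead.
                    bad_offsets.append(offset)
                    data = b"\x00" * want
                if not data:
                    break  # unexpected end of input
                dst.write(data)
                offset += len(data)
    finally:
        os.close(src)
    return bad_offsets


# Hypothetical usage (run as root, device and target paths are assumptions):
# bad = copy_skipping_bad_blocks("/dev/nvme0n1", "/mnt/backup/980pro.img")
# print(len(bad), "unreadable blocks")
```

For a real rescue, GNU ddrescue does the same thing far more carefully (retries, smaller block sizes around bad areas, a map file to resume), so it is probably the better choice if the drive really has unreadable sectors.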
If you value what’s on the offending drive, I suggest having data-oriented backups as well as a clone of the drive. I suspect you are heading for a reinstall, and preparing for this might be an idea, like making a list of installed software and recording product keys.
@jlittle Thanks for your suggestion. All my data is backed up multiple times. I have local backups and also external backups. I just want to clone the system drive to save me a lot of time when something goes wrong with the disk. I use this workstation for my work as a freelance animator and video editor and I can’t afford long downtimes, especially not when I am on a deadline, which happens most of the time. I have multiple workstations and servers in the office and I have backup images from all system drives.
Since you’re experiencing a strange error, you might want to run a memtest; sometimes you find that it’s just memory issues.
Additionally, the BSOD contains "What failed: rcraid.sys", so this could also be an issue with the AMD software RAID driver, or a disk issue. I’d also try putting that M.2 SSD into a system using AHCI, then check the SMART data. You can also try a raw-disk copy and see if it also fails there.
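To make the SMART check a bit more concrete, here is a rough Python sketch that reads the NVMe health log through smartctl and flags the fields most relevant to read errors. It assumes smartmontools 7.0 or newer (for JSON output), root privileges, and a drive that is visible as a plain NVMe device; the helper names and the device path are hypothetical, and the JSON key names are what I would expect from smartctl, so verify them against your own output.

```python
import json
import subprocess


def read_smart_json(device):
    """Run smartctl on the device and return its JSON report as a dict."""
    # smartctl uses its exit status as a bitmask, so a non-zero code
    # does not necessarily mean the call failed; do not raise on it.
    result = subprocess.run(
        ["smartctl", "--json", "--all", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def summarize_nvme_health(report):
    """Pick a few health fields out of the NVMe SMART log (None if absent)."""
    log = report.get("nvme_smart_health_information_log", {})
    fields = (
        "critical_warning",
        "media_errors",
        "num_err_log_entries",
        "percentage_used",
        "available_spare",
    )
    return {name: log.get(name) for name in fields}


def looks_suspect(summary):
    """True if the drive reports a critical warning or any media errors."""
    return bool(summary.get("critical_warning")) or bool(summary.get("media_errors"))


# Hypothetical usage (device path is an assumption):
# summary = summarize_nvme_health(read_smart_json("/dev/nvme0"))
# print(summary, "suspect" if looks_suspect(summary) else "no obvious errors")
```

A non-zero media error count together with the CRC error from Macrium would point at the drive itself rather than the RAID driver.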
Thanks @cowphrase. I will run a memtest right now, I didn’t think about that. I will then try a raw-disk copy.
Unfortunately, the idea with Ubuntu that @wertigon suggested wasn’t successful. Ubuntu only shows an unknown partition. I think this is because it doesn’t work without the RAID software.
The RAID card on your motherboard is probably the reason Linux / Clonezilla is not detecting your RAID setup.
What I would suggest is making a WinPE live USB. Windows should detect it. Following that, try looking for an online tutorial on cloning Windows using Windows CMD. You may also copy the files manually and then rebuild the boot files using bcdboot and bootrec. I don’t remember how it was done, but you could clone Windows in 10 CLI commands or so, and repair the boot records in about another 10 commands.
I also think that the RAID controller is the problem.
I’m very thankful for all your suggestions and I learned some useful tricks but I decided to go with a fresh Windows installation. I’m doing a MemTest right now and if it doesn’t fail, I will delete the RAID array and start from scratch.
As mentioned before, this is my main workstation and I don’t have more time to test other solutions.
Also worth mentioning: you should look into whether you really need RAID anymore. A ZFS / BTRFS configured NAS might serve your purposes better in the long run, so perhaps start planning for that migration?
Both ZFS and BTRFS have better performance than software RAID, and are more flexible to boot. The only reason you would want RAID over a JBOD is if you want to RAID0 because you were running large 5400 RPM HDDs and want to saturate the SATA connection.
SSD is cheap enough these days at roughly $75-$100 per TB (as opposed to $20-$40 per TB for HDD), so it would be well worth it to invest a little, especially as a 2TB drive is good enough for pretty much all purposes. HDDs have pretty much nothing left to bring to a workstation, these days.
I think some of the early 980 Pro firmware versions are buggy.
Bought one early last year and it was acting strange. After the firmware update it’s been solid.
Just FYI, that board is most likely using a software RAID solution (aka Fakeraid). So all the raidy bits are done by the Windows driver on the CPU, aka rcraid.sys. If you want to check SMART data of the drive itself in Ubuntu, you’ll need to set your BIOS back to AHCI temporarily.
Apparently Windows Storage Spaces supports RAID 0, but I’m not really familiar with it. So you could try investigating that as an alternative.
That’s a good point. I already tried to check the firmware with the Samsung Magician software, but because the disk was part of an array, it didn’t recognize it. When I delete the array, I will test the disk and see if I can do a firmware upgrade.
I am aware that software raid is not the best solution. I only did it because the Hyper M.2 card was included with the motherboard and I need a very fast drive with 4 x 1 TB NVMEs for temporary storage. It is part of my workflow and a little bit complicated to explain. But I am fully aware of the risk when making a RAID 0 with 4 drives. Of course, it was completely wrong to configure the system drive as RAID (JBOD) but I think this was done automatically when I switched from AHCI to RAID in the BIOS.
All my project files are stored on a QNAP 80TB NAS as I need to access them from several workstations at the same time. I have a separate 10GbE connection to the NAS. This works very well in my mixed Mac/PC setup.
Wait, so you need 4 times the sequential speeds of PCIe Gen 4.0 reads/writes for your workflow? Because 4x PCIe 4.0 speeds are enough to saturate theoretical 64 GbE connections.
You know your situation best though, just making sure you’ve considered if you actually need it or if it’s just some monkeying about.
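For reference, a quick back-of-the-envelope check of those numbers in Python. It only computes the theoretical PCIe 4.0 link ceiling (16 GT/s per lane with 128b/130b encoding) and ignores protocol overhead and what the drives actually sustain, so treat it as an upper bound. The function names are made up for illustration.

```python
def pcie_lane_gbit(gt_per_s=16.0, payload_bits=128, encoded_bits=130):
    """Theoretical payload rate of one PCIe lane in Gbit/s (PCIe 4.0 defaults)."""
    return gt_per_s * payload_bits / encoded_bits


def link_gbit(lanes=4):
    """Theoretical payload rate of a PCIe 4.0 link with the given lane count."""
    return lanes * pcie_lane_gbit()


one_drive_gbit = link_gbit(4)        # one NVMe drive on a x4 link
four_drives_gbit = 4 * one_drive_gbit  # four drives striped in RAID 0
ten_gbe_gbit = 10.0                  # the 10GbE link to the NAS

print(f"one x4 link:    {one_drive_gbit:6.1f} Gbit/s ({one_drive_gbit / 8:.2f} GB/s)")
print(f"four x4 links:  {four_drives_gbit:6.1f} Gbit/s ({four_drives_gbit / 8:.2f} GB/s)")
print(f"vs. 10GbE:      {four_drives_gbit / ten_gbe_gbit:6.1f}x")
```

So a single PCIe 4.0 x4 link already tops out around 63 Gbit/s, and four of them striped have a ceiling of roughly 252 Gbit/s, about 25 times what the 10GbE link to the NAS can carry.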
If you have a second system, then it might be an idea to try and use that for the clone process.
It’s probably easier to clone a drive when it isn’t actually the running OS drive.
Or, like others said, use a Linux live environment.
If you had a large Dataset (or Database) from some “data-intensive” scientific field (Bio-anything, Chemistry, Physics) and were doing analysis on it, you may need the random-read more than total bandwidth.
Very good point, then again I think RAID would be a very stupid option for that… It would probably be faster then to let the scientific application do the file juggling by itself for optimal performance, just point it to folder paths.
Then again, a lot of scientific people have not got the first clue about low-level optimizations, which is my bread & butter, so… (And no, not everyone needs to have a clue; but in that case, always consult a low-level expert!)