CentOS 7 Imaging, restore, and boot problems

Hi all, I’ll start with some background.

I have a Dell R710 with CentOS 7 installed on 2x 73 GB drives in RAID 1 on a PERC 6/i.
I have set up software on it and configured it for a specific purpose.

I now want to replace the storage with a RAID6 array of 6 drives.

As the server only has 6 drive bays, I wanted to image the original array and restore it to the new one after swapping out the drives and configuring the new array.

I have tried both Clonezilla and Rescuezilla to image the original array to a USB drive and restore it to the new array. In both cases the restored system fails to boot and drops to a bare grub> prompt.

I followed this guide linuxsysadmins(dot)com/grub-rescue-in-centos-and-rhel-7/ to try to repair GRUB using CentOS 7 on a USB stick, but still no luck.

I now get grub rescue> instead of just grub>, and the error says: disk `,gpt2' not found

I found this page dennisk(dot)freeshell(dot)org/cis238dl_grub that says to run insmod lvm, but that also gives me the same “error: disk `,gpt2' not found” message.

At this point I can’t find much on Google that is relevant. Clearly “,gpt2” is very wrong (the disk name in front of the comma is missing), but I’m not sure how to proceed.

Alternatively, is there a better way to image and then restore that works with RAID, LVM and UEFI? I can still put the two original drives back, import the RAID 1 array, and get back to a working system to try again.

Thanks, Tim.

Telling us your partition and file system types would help.

Clonezilla absolutely should work. Are you using the latest amd64 version?

How about changing boot options like EFI to legacy BIOS before cloning?

If Clonezilla won’t work, you could just boot up with any Linux CD and do a basic disk-to-image copy:

cat /dev/sda > /mnt/usb/73gb.img

Can you guess the syntax to restore to your new volume?

If even that doesn’t work, consider whether your new drives are actually functioning.

OK, I just tried with dd and still get the same issue.
I booted to a live USB and used dd to clone the entire drive to a SATA drive in an external USB dock. I then swapped out the drives and initialised the new array, booted to the live USB again, and used dd to clone back to the new array.
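For anyone following along, the dd round trip described above might look roughly like the sketch below. The variable names SRC, IMG and DST are placeholders and not from the original posts. On real hardware SRC and DST would be whole-disk devices, but here they default to scratch files so the steps can be tried without touching a disk.

```shell
#!/bin/sh
# Sketch only. SRC and DST are placeholders: on real hardware they would be
# whole-disk devices (for example /dev/sda), which is an assumption here.
# They default to scratch files so the steps can be tried safely.
set -e

WORK=$(mktemp -d)
SRC=${SRC:-$WORK/source.disk}
IMG=${IMG:-$WORK/73gb.img}
DST=${DST:-$WORK/target.disk}

# Create a 4 MiB stand-in "disk" if no real source was supplied.
[ -e "$SRC" ] || dd if=/dev/urandom of="$SRC" bs=1M count=4 2>/dev/null

# 1. Image the source to a file (dd's transfer statistics are discarded).
dd if="$SRC" of="$IMG" bs=1M 2>/dev/null

# 2. Restore: the same command with input and output swapped.
dd if="$IMG" of="$DST" bs=1M 2>/dev/null
sync

# 3. Verify. A real target may be larger than the image, so compare only
#    the first image-sized span of the target against the image.
size=$(( $(wc -c < "$IMG") ))
if head -c "$size" "$DST" | cmp -s - "$IMG"; then
    echo "restore verified ($size bytes)"
else
    echo "restore MISMATCH" >&2
    exit 1
fi
```

A byte-identical restore only shows that the copy itself is sound. It says nothing about whether the firmware can find and start the bootloader on the new array.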

The partitions are the default layout for a CentOS 7 install (screenshot from 2020-10-23 18-52-36).

I am beginning to think it might be the drives, which I got second-hand. I have seen reports from other people about drives from NetApp systems with unhelpful firmware. The RAID controller doesn’t complain when creating the array, but maybe it is still breaking something?

Drives that come from a NetApp system are formatted with 520-byte sectors, so they will not work with traditional RAID controllers, which expect 512-byte sectors.

So you need to do a low-level format on all the drives to change the sector size to 512 bytes, and then they should work as expected.


Checksum a file on your USB drive, e.g.:

sha1sum 73gb.img

Copy it to a partition on your new array:

mount /dev/sda1 /mnt/hd/
cp 73gb.img /mnt/hd/

Then flush pending writes, drop caches, and checksum the destination:

sync
echo 3 > /proc/sys/vm/drop_caches
sha1sum /mnt/hd/73gb.img

If they don’t match perfectly, something is very wrong with your array, and you should be glad you found out now instead of later.
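The checksum test above can be wrapped into a small helper, as a rough sketch. The function name verify_copy is made up for illustration. Dropping the page cache needs root, so the helper only attempts it and carries on if that fails. Without it, the second checksum may be served from RAM rather than read back from the array.

```shell
#!/bin/sh
# verify_copy SRC DEST_DIR  (hypothetical helper, not from the thread)
# Copies SRC into DEST_DIR and compares SHA-1 sums before and after.
verify_copy() {
    src=$1
    dest=$2/$(basename "$1")

    before=$(sha1sum "$src" | cut -d' ' -f1)

    # Copy, then flush dirty pages so the data actually reaches the disk.
    cp "$src" "$dest" || return 1
    sync

    # Try to drop the page cache so the re-read comes from the device.
    # This needs root; if it fails, say so and carry on.
    { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null ||
        echo "note: could not drop caches, re-read may come from RAM" >&2

    after=$(sha1sum "$dest" | cut -d' ' -f1)

    if [ "$before" = "$after" ]; then
        echo "OK $after"
    else
        echo "MISMATCH $before $after"
        return 1
    fi
}
```

It would be called as, for example, verify_copy /mnt/usb/73gb.img /mnt/hd, with both paths being whatever mount points are in use. A larger test file exercises more of the array than a small one.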

All the posts I had seen said that if your drives had the wrong sector size, the RAID controller would not create the array. Mine created it happily, so I assumed that wasn’t the issue, but it might be.

I’ll try to format the drives.

OK, I’m formatting the drives at the moment, and after running:
sg_format --format --size=512 --six -v /dev/sg2

I get this output:

NETAPP X423_HCOBE900A10 NA02 peripheral_type: disk [0x0]
PROTECT=1
<< supports protection information>>
Unit serial number: KPWNXPAL
LU name: 5000cca0225e6a78
mode sense (6) cdb: 1a 00 01 00 fc 00
Mode Sense (block descriptor) data, prior to changes:
Number of blocks=1758174768 [0x68cb9e30]
Block size=512 [0x200]

A FORMAT will commence in 15 seconds
ALL data on /dev/sg3 will be DESTROYED
Press control-C to abort

Does “Block size=512” under “prior to changes” mean that the drives were already using 512-byte sectors?
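As a quick sanity check on those numbers, the block count and block size from the mode sense output can be multiplied out. This is only arithmetic on the values printed above. The reading that the result matches a nominal 900 GB drive (suggested by the “900” in the model string) is an assumption.

```shell
#!/bin/sh
# Values copied from the sg_format output above.
blocks=1758174768
block_size=512

bytes=$((blocks * block_size))
echo "$bytes bytes"
echo "$((bytes / 1000000000)) GB (decimal)"

# Cross-check the hex value that sg_format printed next to the block count.
printf '0x%x\n' "$blocks"
```

The capacity comes out at a round 900 GB with 512-byte blocks, which fits the drive already reporting 512-byte sectors before the format started.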

Do I need to update the drive firmware?

Once all the drives are formatted in a few hours, I’ll try the checksum test.

OK, I have formatted all the drives to 512-byte sectors (maybe they already were?).
I did a full init on the array.
It passes the checksum test when booting from a live Ubuntu USB.

I tried using Rescuezilla to restore from an image, but it failed when restoring the main LVM2 partition with the error “read error: no such file or directory”, which is not very helpful.

I then used dd to copy the entire drive back from an external drive in a USB dock, but it still doesn’t boot.

Not sure what to try next. Maybe I’ll try swapping the PERC 6/i with a spare I have, just in case?

If it is JUST a boot issue, the mkbootdisk command has worked well for me in the past.

And System Rescue CD has an option to try to boot the local system on hard disk using the included kernel (which causes version issues, but should show the disk is otherwise intact).

Thanks, that might be useful in the future.

Just for a sanity check, I tried a clean install of CentOS 7 on the array and even that wouldn’t boot. I then switched to BIOS boot mode instead of UEFI, reinstalled, and it booted. The original install worked fine in UEFI mode with the RAID 1 and the same controller, so I’m not sure what’s going on there.

In the end I just set everything up again on the new install. As I could copy scripts and config files from the dd clone I made, it only took a few hours (most of that was setting up Samba permissions). Luckily I had documented some of my original steps.

Thanks.
