HOWTO: Clone your SSDs & Boot Drives (for Archival ..and other things)

Hey all,

If you’ve seen my recent build log of my second Threadripper system (this time around, a beefy workstation), you’ll know I used a neat trick to get up and running rather quickly. All of, hmm, say 15 mins?

When building my first Threadripper box, I was just getting into Fedora 26 and, I suppose intentionally, that SSD took on the following form:

  • Fresh Fedora 26 build

  • Decent partition scheme to start off; never got around to expanding the free-space in /opt.

      NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT  
      sda      8:0    0 238.5G  0 disk              
      ├─sda1   8:1    0   448M  0 part /boot/efi    
      ├─sda2   8:2    0  46.6G  0 part /            
      ├─sda3   8:3    0  29.8G  0 part [SWAP]       
      ├─sda4   8:4    0    14G  0 part /home        
      └─sda5   8:5    0    21G  0 part /opt         
    
  • Basic essential security tasks that I always perform along with replacing Fedora’s default firewall with ufw just cause I prefer it.

  • Copied over all my production SSH keys // yes, I need to hold these in a LUKS encrypted partition, but that’s for a future enhancement.

  • Installed docker and setup my rtmp Twitch streamer (as my virtualised “server” stack doesn’t exist just yet). Yeah, I’m also running an internal Ghost blog for kicks.

      [mdesilva@ballinripper ~]$ docker ps
      CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
      f45b795028fa        ghost               "docker-entrypoint..."   12 days ago         Up 20 hours         0.0.0.0:1940->2368/tcp   ghost
      75d1b1da7e94        rtmp_master         "nginx -g 'daemon ..."   3 weeks ago         Up 20 hours         0.0.0.0:1935->1935/tcp   twitch-rtmp
    
  • Setup /opt rather nicely - it has Nvidia drivers, some custom kernel modules etc. I already have some scripts in place for IOMMU spelunking, couple ISOs for virsh etc.

      [mdesilva@ballinripper ~]$ ll -sh /opt
      total 72K
      4.0K drwxr-xr-x. 2 root     root     4.0K Sep 16 15:26 drivers
      4.0K drwxr-xr-x. 3 root     root     4.0K Oct 27 13:16 ghost-docker
      4.0K drwxr-xr-x. 3 root     root     4.0K Sep 16 14:57 google
      8.0K -rw-rw-r--. 1 mdesilva mdesilva 5.9K Sep 22 01:29 iommu_groups
      4.0K drwxr-xr-x. 2 mdesilva mdesilva 4.0K Sep 19 12:52 isos
      4.0K drwxrwxr--. 3 root     wheel    4.0K Sep 19 14:08 libvirtd
       16K drwx--x---. 2 root     root      16K Sep 16 14:48 lost+found
      4.0K -rwxrwxr-x. 1 mdesilva mdesilva  159 Sep 22 01:28 ls-iommu.sh
      4.0K -rw-rw-r--. 1 mdesilva mdesilva 3.9K Sep 22 01:37 lspci_list
      4.0K drwxrwxr-x. 4 mdesilva mdesilva 4.0K Sep 22 14:28 modules
      4.0K drwxrwxr-x. 3 mdesilva mdesilva 4.0K Sep 20 09:30 projects
      4.0K drwxr-xr-x. 3 root     root     4.0K Sep 30 04:16 stability-testing
      4.0K drwxrwxr-x. 4 mdesilva mdesilva 4.0K Oct 13 12:04 twitch
      4.0K drwxrwxr-x. 3 mdesilva mdesilva 4.0K Oct 27 11:41 twitch-nginx-rtmp-docker
    
  • Woah, I forgot that I’d even set up a VM on the Gigabyte Threadripper box. On the ballinripper, post-clone, it auto-magically booted up since I’d configured the Asus UEFI to enable virtualisation :rofl: …and it got itself an IP over DHCP, and connecting to it works too!

      [root@ballinripper ~]# virsh list --all
       Id    Name                           State
      ----------------------------------------------------
       1     ubuntu-vm1                     running 
    
  • I had installed the Phoronix benchmarking test-suite. The Cryptography test is a particularly mid-range workload with john-the-ripper etc. Run it with: phoronix-test-suite benchmark cryptography

  • Black-listed the Nouveau driver in grub with rd.driver.blacklist=nouveau; it helps when installing the Nvidia closed-source drivers.

  • Fedora only needed a quick update with dnf update

As a starting point, this wasn’t too bad. So, I had the new(er) Threadripper workstation on the bench, passing POST… what does one do? Break out dd.

Update – Since completing this backup, I found an even better alternative, dcfldd, which as a “forensic” tool has some sweet hashing and error-logging features. In a future version, I’ll most certainly upgrade my script to use this tool.

I wanted to backup the SSDs to image files, and so I hooked up a WD Black 4TB drive to my trusty StarTech Dual-Bay SSD/HDD Toaster (USB3.1 ftw)… and got scripting.

Mind you, this is a fresh ‘rehash’ for public consumption based on my initial crude attempt and isn’t 100% tested – but at least, it’ll serve as a nice starting point for anyone getting started.

  • Didn’t bake in a menu-system or arg support, as this was quick-n-dirty.
  • Specify settings via env vars at the top of the script.
  • Make sure you pick the correct input source device, which is the backup source.
  • Intentionally left out dd's conv=sync and conv=noerror (skip-errors) options; I wanted to be notified of any read-errors.
  • Packed into a tarball, which admittedly, compresses already gzipped images. Oh well.
  • Managed to bake 500GB of SSD ‘block-level’ data into a 79GB tarball.
  • Checksums are manually generated. The aforementioned dcfldd does this ‘out of the box’ and can even include error logging, so I’d suggest using that!
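For anyone who wants to see how the points above fit together with plain dd, here is a minimal sketch. It is not the removed original: the function name image_device, its arguments and the output file names are made up for illustration, and it assumes GNU dd, gzip and md5sum.

```shell
#!/bin/bash
# Hypothetical sketch of a plain-dd backup; NOT the original (removed) script.
set -o pipefail

# image_device SRC OUT_DIR NAME
# Streams SRC through gzip into OUT_DIR/NAME.img.gz, then records an md5
# checksum of the compressed image in OUT_DIR/NAME.chk.
image_device() {
  local src="$1" out_dir="$2" name="$3"
  mkdir -p "$out_dir" || return 1
  # No conv=noerror or conv=sync: a read error makes dd fail, and with
  # pipefail set the whole pipeline fails instead of padding over bad blocks.
  dd if="$src" bs=64K status=none | gzip -c > "$out_dir/$name.img.gz" || return 1
  # Checksums are generated by hand. Running md5sum from inside the directory
  # records relative paths, so 'md5sum -c' still works if the files are moved.
  ( cd "$out_dir" && md5sum "$name.img.gz" > "$name.chk" )
}

# Example (needs root for a real device):
#   image_device /dev/sda /mnt/wdblack/demo fedora26_backup_bs64K
```

Pointing SRC at a small regular file first is a cheap way to rehearse the whole thing before touching a real disk.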

Update – I’ve removed the old version; see the improved version in Post #5 below.

Once the backup is done, I wanted to verify the generated files. For this task I whipped up a simple Makefile,

	[root@ballinripper wdblack]# make verify               
	md5sum -c ballinripper_ssd_backups_08112017.chk        
	ballinripper_fedora26_backup_bs64K_08112017.img.gz: OK 
	ballinripper_fedora26_list_fdisk_08112017.info: OK     
	ballinripper_windows10_backup_bs64K_08112017.img.gz: OK
	ballinripper_windows10_list_fdisk_08112017.info: OK    

Both my SSDs are now backed up, and images are now stored on my FreeNAS pile o’ rust. Leave a like and thoughts - how have you tackled this in the past?

I’ve scripted automated backups and uploads to AWS S3 and may tailor this setup to do the same, such as 3-5 images held in rotation, per SSD. It would be neat to have a daily backup in the cloud should I ever need it, but that’s a later task. @SgtAwesomesauce @wendell @ryan @Raziel
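As a rough idea of what "3-5 images held in rotation" could look like on the local disk (the S3 upload side is left out entirely), here is a sketch that keeps only the newest N tarballs for a given prefix. The function name prune_backups and its arguments are hypothetical; it assumes GNU find and file names without newlines.

```shell
#!/bin/bash
# Hypothetical local rotation helper; the upload to S3 is not covered here.

# prune_backups DIR PREFIX KEEP
# Deletes all but the KEEP most recently modified DIR/PREFIX*.tar.gz files.
prune_backups() {
  local dir="$1" prefix="$2" keep="$3" old
  # Sort by modification time rather than by name: the ddmmyyyy stamp used
  # in the file names does not sort chronologically.
  find "$dir" -maxdepth 1 -type f -name "${prefix}*.tar.gz" -printf '%T@ %p\n' \
    | sort -rn \
    | tail -n +"$((keep + 1))" \
    | cut -d' ' -f2- \
    | while IFS= read -r old; do
        rm -f -- "$old"
      done
}

# Example: keep the three newest Fedora images
#   prune_backups /mnt/wdblack threadripper_server_fedora26_ 3
```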

Thanks!!

Just wondering why you’re going the long route and writing this script yourself, when things like Clonezilla exist? :confused: I don’t see much of a difference in terms of the result.
If you don’t want to remember which options you chose each time you can also enter the command line there as well.

@bsodmike

I appreciate the tag. I am not experienced (face turns red and chokes) with Linux. Part of the reason I came here was to learn Linux in time. Historically most of the programs I used favored Windows and Intel/Nvidia. Times are a changing as this is my first full AMD build, perhaps it is time to get serious about Linux.

I use cloning software for my backups. I have SSD back ups in my box and an external HDD set up which is disconnected while not in use. I do plan to set up a home network in time, but time is short right now. I do believe I am in the right spot for that as well when the time comes.

This is a good write-up. Thanks for the tag. :blush:

For the record, I use borg for backups. You might be interested in it. It supports dedup and pruning.

Here’s a quick update -

Improved version with dcfldd

#!/bin/bash
# Script to backup a source device with checksums, nicely packaged
# into a tarball for archiving.
#
# Copyright (c) 2017 Michael de Silva, CTO Inertialbox (inertialbox.com)
# Blog: mwdesilva.com  // Expertise: desilva.io // Twitter: @bsodmike
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

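# Abort on any error, on unset variables, and when any stage of a pipeline
# fails (so a dcfldd read error is not masked by gzip exiting cleanly).
set -euo pipefail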

NOW="$(date +"%d%m%Y")"
BACKUP_SYSTEM="threadripper_server"
BACKUP_SYSTEM_OS="fedora26"
BACKUP_PATH="/mnt/wdblack"
IMAGE_REF="${BACKUP_SYSTEM}_${BACKUP_SYSTEM_OS}"
INPUT_DEVICE="/dev/sda"
BACKUP_DIRNAME="${IMAGE_REF}_ssd_dev_sda_backup_${NOW}"
BACKUP_DIR="${BACKUP_PATH}/${BACKUP_DIRNAME}"
BYTE_SIZE="64K"
IMAGE_OUTPUT="${IMAGE_REF}_backup_bs${BYTE_SIZE}_$NOW"

echo ""
echo "Commencing backup..."
mkdir -p "${BACKUP_DIR}"
dcfldd if="${INPUT_DEVICE}" bs="${BYTE_SIZE}" hash=md5,sha512 \
  hashlog="${BACKUP_DIR}/${IMAGE_REF}_backedup_image_hashlog_${NOW}.chk" \
  status=on | gzip -c > "${BACKUP_DIR}/${IMAGE_OUTPUT}.img.gz"

# Keep a copy of the partition table alongside the image
fdisk -l "${INPUT_DEVICE}" > "${BACKUP_DIR}/${IMAGE_REF}_backup_list_fdisk_${NOW}.info"

# Persist record of MD5 checksums for generated backups
echo ""
echo "Persisting MD5 Checksums of all generated files..."
md5sum "${BACKUP_DIR}"/* > "${BACKUP_DIR}/${BACKUP_DIRNAME}_md5_checksums.chk"

# Compress and generate tarball
echo ""
echo "Compressing and generating tarball..."
tar -zcvf "${BACKUP_PATH}/${BACKUP_DIRNAME}.tar.gz" "${BACKUP_DIR}"

echo ""
echo "Backup complete!"
echo ""

This is a vastly improved version, now using dcfldd as I’d previously considered. Why? Well, I re-ran the SSD backups on the Gigabyte Threadripper box as well.

This is what a backup run looks like -

	[root@threadripper scripts]# ./ssd_backup.sh

	Commencing backup...
	7814144 blocks (244192Mb) written.
	7814346+1 records in
	7814346+1 records out

	Persisting MD5 Checksums of all generated files...

	Compressing and generating tarball...
	tar: Removing leading `/' from member names
	/mnt/wdblack/threadripper_server_fedora26_ssd_dev_sda_backup_10112017/
	/mnt/wdblack/threadripper_server_fedora26_ssd_dev_sda_backup_10112017/threadripper_server_fedora26_backup_bs64K_10112017.img.gz
	/mnt/wdblack/threadripper_server_fedora26_ssd_dev_sda_backup_10112017/threadripper_server_fedora26_backup_list_fdisk_10112017.info
	/mnt/wdblack/threadripper_server_fedora26_ssd_dev_sda_backup_10112017/threadripper_server_fedora26_backedup_image_hashlog_10112017.chk
	/mnt/wdblack/threadripper_server_fedora26_ssd_dev_sda_backup_10112017/threadripper_server_fedora26_ssd_dev_sda_backup_10112017_md5_checksums.chk

	Backup complete!

…and the generated files, having made two runs (i) /dev/sda and (ii) /dev/sdb

	[root@threadripper wdblack]# ll -h
	total 79G
	drwx------. 2 root root  16K Nov  8 16:56 lost+found
	drwxr-xr-x. 2 root root 4.0K Nov 10 04:50 threadripper_server_fedora26_ssd_dev_sda_backup_10112017
	-rw-r--r--. 1 root root  48G Nov 10 04:42 threadripper_server_fedora26_ssd_dev_sda_backup_10112017.tar.gz
	drwxr-xr-x. 2 root root 4.0K Nov 10 05:27 threadripper_server_windows10_ssd_dev_sdb_backup_10112017
	-rw-r--r--. 1 root root  31G Nov 10 05:48 threadripper_server_windows10_ssd_dev_sdb_backup_10112017.tar.gz

In this version I wanted the file-naming scheme to be spot on and cleaned up quite a few mistakes in the earlier version shared.
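Getting an image back onto a drive is the same pipeline in reverse. The sketch below is an assumption about how that could be done, not something from the script above: the function name restore_image is made up, and iflag=fullblock, conv=fsync and status=none are GNU dd options. Double-check the target before running anything like this, since it overwrites whatever is there.

```shell
#!/bin/bash
# Hypothetical restore helper; overwrites TARGET completely.
set -o pipefail

# restore_image IMAGE_GZ TARGET
# Decompresses IMAGE_GZ and writes it block-for-block onto TARGET.
restore_image() {
  local image="$1" target="$2"
  # Refuse a target that does not exist: a mistyped /dev path would otherwise
  # be created as an ordinary file and quietly fill the root filesystem.
  if [ ! -e "$target" ]; then
    echo "restore_image: target $target does not exist" >&2
    return 1
  fi
  # Test the gzip stream first. This reads the whole image once more, but it
  # avoids discovering a truncated archive halfway through the write.
  gzip -t "$image" || return 1
  gunzip -c "$image" \
    | dd of="$target" bs=64K iflag=fullblock conv=fsync status=none
}

# Example (needs root, destroys existing data on /dev/sdX):
#   restore_image threadripper_server_fedora26_backup_bs64K_10112017.img.gz /dev/sdX
```

The target needs to be at least as large as the source device was, otherwise dd stops with a "No space left on device" error.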

Hi mate,

For a couple of reasons, one of which is ‘continuous personal development & improvement’ (yeah, I like to maintain a TIL CI/CD for my human side :laughing:), and I find the best way is to try to build something yourself.

Well, the original attempt only took me 5-10 mins. It would (and still would) take me longer to use something like Clonezilla; heck reading the docs will take longer than that.

However, I did take time to revise this into a starting point for an actually useful script (which needs further improving, more on this later) that I could use in a production situation.

I try to take a DevOps approach to my app and systems coding - here’s a contrived scenario. Assume I’m working on an AWS EC2 instance and, for whatever contrived reason, I want to move a partition (of data) between it and a local instance (metal or VM), whilst guaranteeing that state is maintained for the data in that partition. This would be an interesting thing to solve, but I could use my script for this purpose now.

Rather than pointing it at backing up an entire device, I can pass /dev/sdXy as the INPUT_DEVICE in the improved script.

That said, I’ll also take a look at borg as I’ve been curious to have a play with that as well.

It’s also easi(er) for me to maintain something that I’ve dissected completely, and understood all the pain-points within. I guess that’s really what I’m after, and pretty much my ‘day’ job (albeit I work completely at night, due to syncing with PST and living in Sri Lanka).

99% of troubleshooting (systems, to bug fixes) boils down to

  • Ascertain the general cause of the issue
  • Actually find the cause, that proverbial needle in the hay-STACK.
  • Apply a bandage, test in test/staging env.
  • Craft a worthy fix, author the most epic git commit ever, pat yourself on the back…

Understand, determine, and patch that b***ch up! How’s that for a DevOps mantra? @SgtAwesomesauce?

Further improvements, which I’ll probably end up doing…

  • Needs a help log
  • Take args for the basic input variables
  • Better error logging
  • Tie in uploads to AWS with the excellent Ruby-based backup gem.
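A first stab at the "take args" and help items could look something like this. It is only a sketch: the option letters and the parse_args and usage names are made up, while the variable names and defaults mirror the ones at the top of the improved script. Environment variables still work, and flags override them.

```shell
#!/bin/bash
# Hypothetical argument handling for the backup script; option letters are
# illustrative and not part of the posted version.

usage() {
  echo "Usage: ssd_backup.sh [-i input_device] [-o backup_path] [-b block_size]"
  echo "  -i  device or partition to image (e.g. /dev/sda or /dev/sdXy)"
  echo "  -o  directory the backup is written to"
  echo "  -b  block size handed to dcfldd (e.g. 64K)"
}

# parse_args "$@"
# Sets INPUT_DEVICE, BACKUP_PATH and BYTE_SIZE from the environment (or the
# script's defaults), then lets command-line flags override them.
parse_args() {
  INPUT_DEVICE="${INPUT_DEVICE:-/dev/sda}"
  BACKUP_PATH="${BACKUP_PATH:-/mnt/wdblack}"
  BYTE_SIZE="${BYTE_SIZE:-64K}"
  local opt OPTARG OPTIND=1
  while getopts ':i:o:b:h' opt; do
    case "$opt" in
      i) INPUT_DEVICE="$OPTARG" ;;
      o) BACKUP_PATH="$OPTARG" ;;
      b) BYTE_SIZE="$OPTARG" ;;
      h) usage; exit 0 ;;
      *) usage >&2; return 2 ;;   # unknown flag or missing value
    esac
  done
}

# Example: image a single partition with a larger block size
#   parse_args -i /dev/sdb2 -b 1M
```

Calling parse_args "$@" near the top of the script, in place of the hard-coded assignments, would leave the rest of it untouched.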

Yeah, that’s understandable :slight_smile:

Was just wondering. Reading docs isn’t really necessary in Clonezilla imho, the bootable CD menu is pretty straight forward and has descriptions for everything.

I guess when you’re backing up to AWS or whatever at the same time, something homebrew is indeed a better solution; for pure backup, though, I really like Clonezilla. It uses standard unix tools underneath anyway.

Definitely sounds like a good mantra. Mine has felt more like Scramble, Commit, Iterate! lately.

Throw this up on GitHub (or wherever) and we can turn it into a collaborative project. I’m not experienced with Ruby (Python guy over here), but it can’t be too hard to pick up.

If you guys wanna use my gitlab for projects, feel free.

Link in my bio.

I appreciate the offer. I’ll definitely consider it.

You’re welcome :smiley: I like it when people use my stuff.

Same here, kind of you - thanks!