BTRFS broken after Clonezilla/partclone backup

Hey there,

I wanted to move my homelab from a bare-metal deployment (Gentoo) to a virtualization platform. For the migration I used Clonezilla to create a partclone backup of my three partitions:

512M EFI FAT32
1G BOOT ext4
512G LUKS ROOT with btrfs

Now, after restoring the partitions, I sadly cannot mount the btrfs partition anymore, and of course the Clonezilla backup was supposed to be my working backup.
Mounting the partition just produces the following dmesg errors:

[14782.010972] BTRFS error (device dm-0): bad tree block start, mirror 1 want 375878467584 have 16951924986927170840
[14782.011294] BTRFS error (device dm-0): bad tree block start, mirror 2 want 375878467584 have 99318939617790344
[14782.011778] BTRFS error (device dm-0): failed to read chunk root
[14782.012432] BTRFS error (device dm-0): open_ctree failed
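
For reference, the mount attempt looks roughly like this (device names are examples; they differ between the original disk and the VM):

# open the LUKS container, then try to mount the btrfs inside it
cryptsetup open /dev/nvme0n1p3 luks
mount /dev/mapper/luks /mnt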

Using btrfs-find-root prints

WARNING: cannot read chunk root, continue anyway
Superblock thinks the generation is 613435
Superblock thinks the level is 1

Neither btrfs check --repair nor the btrfs rescue commands work (sketched below).
However the data (or part of it) is clearly there, as running e.g. “strings” on the image shows the contents of some known files.
I also found this related issue “Thomas-Tsai/partclone issues/163” (can’t post links yet :/) on GitHub, but there does not seem to be any solution yet.
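
For reference, the kind of commands I mean (with /dev/mapper/luks being the opened LUKS mapping; the exact rescue subcommands I tried may have differed):

# read-only check first, then the repair attempt
btrfs check --readonly /dev/mapper/luks
btrfs check --repair /dev/mapper/luks
# rescue subcommands along these lines
btrfs rescue super-recover /dev/mapper/luks
btrfs rescue zero-log /dev/mapper/luks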

Now, I am not that familiar with btrfs internals, so I am asking for your help in the hope of maybe getting my data back.

Got any snapshots sent elsewhere? If yes, just do a btrfs receive.

Also, btrfs send -f <file> is all you need as a backup. No need for external tools (which might break things and are slow).
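
A minimal sketch of that workflow, assuming a subvolume mounted at /mnt/root and a backup location at /mnt/backup (all paths are just examples):

# take a read-only snapshot and serialize it into a file
btrfs subvolume snapshot -r /mnt/root /mnt/root/.snap-2023-05-22
btrfs send -f /mnt/backup/root-2023-05-22.send /mnt/root/.snap-2023-05-22
# later, replay it into any other btrfs filesystem
btrfs receive -f /mnt/backup/root-2023-05-22.send /mnt/newroot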

Why did the backup damage the source? It’s only supposed to read from it.

Posts in the issue tracker mention encryption as well, so it might be related to that. Or it’s some general bug in the backup software.

Sadly I don’t have any snapshots outside of the partition.
I thought using Clonezilla would be rather safe D:
The backup did not damage the source, but the source and destination are the same hardware: I backed up the server SSD to the NAS, overwrote the SSD with Proxmox, then restored the backup from the NAS into a Proxmox virtual HDD, and only then noticed the issue.
In short: server SSD → NAS, Proxmox → server SSD, NAS → Proxmox virtual HDD

According to “btrfs rescue chunk-recover”, all chunks are alright:

btrfs rescue chunk-recover -v /dev/mapper/luks

All Devices:
Device: id = 1, name = /dev/mapper/luks

Scanning: DONE in dev0
DEVICE SCAN RESULT:
Filesystem Information:
sectorsize: 4096
nodesize: 16384
tree root generation: 613435
chunk root generation: 613432

All Devices:
Device: id = 1, name = /dev/mapper/luks

All Block Groups:
Block Group: start = 136429174784, len = 1073741824, flag = 1
Block Group: start = 137502916608, len = 1073741824, flag = 1
Block Group: start = 138576658432, len = 1073741824, flag = 1
Block Group: start = 139650400256, len = 1073741824, flag = 1
Block Group: start = 140724142080, len = 1073741824, flag = 1
Block Group: start = 141797883904, len = 1073741824, flag = 1
Block Group: start = 142871625728, len = 1073741824, flag = 1
Block Group: start = 143945367552, len = 1073741824, flag = 1
…
Block Group: start = 373726117888, len = 1073741824, flag = 24
Block Group: start = 374799859712, len = 1073741824, flag = 1
Block Group: start = 375873601536, len = 33554432, flag = 22
Block Group: start = 375907155968, len = 1073741824, flag = 24
Block Group: start = 376980897792, len = 1073741824, flag = 1
Block Group: start = 378054639616, len = 1073741824, flag = 1
…

All Chunks:
Chunk: start = 136429174784, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 1048576
Chunk: start = 137502916608, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 1074790400
Chunk: start = 138576658432, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 68758274048
Chunk: start = 139650400256, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 69832015872
Chunk: start = 140724142080, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 70905757696
Chunk: start = 141797883904, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 71979499520
Chunk: start = 142871625728, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 73053241344
…
[ 0] Stripe: devid = 1, offset = 277064187904
Chunk: start = 373726117888, len = 1073741824, type = 24, num_stripes = 2
Stripes list:
[ 0] Stripe: devid = 1, offset = 55873372160
[ 1] Stripe: devid = 1, offset = 56947113984
Chunk: start = 374799859712, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 102044270592
Chunk: start = 375873601536, len = 33554432, type = 22, num_stripes = 2
Stripes list:
[ 0] Stripe: devid = 1, offset = 2253389824
[ 1] Stripe: devid = 1, offset = 2286944256
Chunk: start = 375907155968, len = 1073741824, type = 24, num_stripes = 2
Stripes list:
[ 0] Stripe: devid = 1, offset = 61242081280
[ 1] Stripe: devid = 1, offset = 62315823104
Chunk: start = 376980897792, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 278137929728
…
All Device Extents:
Device extent: devid = 1, start = 1048576, len = 1073741824, chunk offset = 136429174784
Device extent: devid = 1, start = 1074790400, len = 1073741824, chunk offset = 137502916608
Device extent: devid = 1, start = 2253389824, len = 33554432, chunk offset = 375873601536
Device extent: devid = 1, start = 2286944256, len = 33554432, chunk offset = 375873601536
Device extent: devid = 1, start = 2320498688, len = 1073741824, chunk offset = 400603217920
Device extent: devid = 1, start = 3394240512, len = 1073741824, chunk offset = 401676959744
Device extent: devid = 1, start = 4467982336, len = 1073741824, chunk offset = 402750701568
Device extent: devid = 1, start = 16144924672, len = 1073741824, chunk offset = 396308250624
Device extent: devid = 1, start = 17218666496, len = 1073741824, chunk offset = 397381992448
Device extent: devid = 1, start = 18292408320, len = 1073741824, chunk offset = 403824443392
Device extent: devid = 1, start = 19366150144, len = 1073741824, chunk offset = 404898185216
Device extent: devid = 1, start = 20439891968, len = 1073741824, chunk offset = 405971927040
Device extent: devid = 1, start = 21513633792, len = 1073741824, chunk offset = 407045668864
…

CHECK RESULT:
Recoverable Chunks:
Chunk: start = 136429174784, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 1048576
Block Group: start = 136429174784, len = 1073741824, flag = 1
Device extent list:
[ 0]Device extent: devid = 1, start = 1048576, len = 1073741824, chunk offset = 136429174784
Chunk: start = 137502916608, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 1074790400
Block Group: start = 137502916608, len = 1073741824, flag = 1
Device extent list:
[ 0]Device extent: devid = 1, start = 1074790400, len = 1073741824, chunk offset = 137502916608
…
Chunk: start = 375873601536, len = 33554432, type = 22, num_stripes = 2
Stripes list:
[ 0] Stripe: devid = 1, offset = 2253389824
[ 1] Stripe: devid = 1, offset = 2286944256
Block Group: start = 375873601536, len = 33554432, flag = 22
Device extent list:
[ 0]Device extent: devid = 1, start = 2286944256, len = 33554432, chunk offset = 375873601536
[ 1]Device extent: devid = 1, start = 2253389824, len = 33554432, chunk offset = 375873601536
Chunk: start = 375907155968, len = 1073741824, type = 24, num_stripes = 2
Stripes list:
[ 0] Stripe: devid = 1, offset = 61242081280
[ 1] Stripe: devid = 1, offset = 62315823104
Block Group: start = 375907155968, len = 1073741824, flag = 24
Device extent list:
[ 0]Device extent: devid = 1, start = 62315823104, len = 1073741824, chunk offset = 375907155968
[ 1]Device extent: devid = 1, start = 61242081280, len = 1073741824, chunk offset = 375907155968
Chunk: start = 376980897792, len = 1073741824, type = 1, num_stripes = 1
Stripes list:
[ 0] Stripe: devid = 1, offset = 278137929728
Block Group: start = 376980897792, len = 1073741824, flag = 1
Device extent list:
[ 0]Device extent: devid = 1, start = 278137929728, len = 1073741824, chunk offset = 376980897792
…
Unrecoverable Chunks:

Total Chunks: 172
Recoverable: 172
Unrecoverable: 0

Orphan Block Groups:

Orphan Device Extents:

corrupt node: root=1 block=373727870976 slot=0, corrupt node: root=1 block=373727870976, nritems too large, have 15 expect range [1,0]
Couldn’t read tree root
open with broken chunk error
Chunk tree recovery failed

So, you restored the backup (overwriting the source)? And now the file system won’t mount any more?

Can you see in Clonezilla’s logs exactly which commands were executed?

Is it possible to mount the image clonezilla created (without/before restoring)?

Exactly. I still have the clonezilla backup but the source of the backup is gone now.
Here is the full log:

Clonezilla log
This image was saved by Clonezilla at 2023-05-22 22:55:49 UTC.
Saved by clonezilla-live-3.1.0-22-amd64.
The log during saving:
----------------------------------------------------------
Starting /usr/sbin/ocs-sr at 2023-05-22 22:10:01 UTC...
*****************************************************.
Clonezilla image dir: /home/partimag
Shutting down the Logical Volume Manager
Finished Shutting down the Logical Volume Manager
The selected devices: nvme0n1p1 nvme0n1p2 nvme0n1p3
PS. Next time you can run this command directly:
/usr/sbin/ocs-sr -luks yes -q2 -c -j2 -z1p -i 0 -sfsck -senc -p choose saveparts 2023-05-22-22-img nvme0n1p1 nvme0n1p2 nvme0n1p3
*****************************************************.
The selected devices: nvme0n1p1 nvme0n1p2 nvme0n1p3
The following step is to save the hard disk/partition(s) on this machine as an image:
*****************************************************.
Machine: NUC11TNHi5
nvme0n1p1 (512M_vfat(In_Samsung_SSD_980_PRO_1TB_)_Samsung_SSD_980_PRO_1TB_XXXX)
nvme0n1p2 (1G_ext4(In_Samsung_SSD_980_PRO_1TB_)_Samsung_SSD_980_PRO_1TB_XXXX)
nvme0n1p3 (512G_crypto_LUKS(In_Samsung_SSD_980_PRO_1TB_)_Samsung_SSD_980_PRO_1TB_XXXX)
*****************************************************.
-> "/home/partimag/2023-05-22-22-img".
Shutting down the Logical Volume Manager
Finished Shutting down the Logical Volume Manager
*****************************************************.
Now open the LUKS device: /dev/nvme0n1p3
Running: cryptsetup open --type luks /dev/nvme0n1p3 ocs_luks_9X5
Starting saving /dev/nvme0n1p1 as /home/partimag/2023-05-22-22-img/nvme0n1p1.XXX...
/dev/nvme0n1p1 filesystem: vfat.
*****************************************************.
*****************************************************.
Use partclone with pigz to save the image.
Image file will not be split.
*****************************************************.
If this action fails or hangs, check:
* Is the disk full ?
*****************************************************.
Running: partclone.vfat -z 10485760 -N -L /var/log/partclone.log -c -s /dev/nvme0n1p1 --output - | pigz -c --fast -b 1024 --rsyncable > /home/partimag/2023-05-22-22-img/nvme0n1p1.vfat-ptcl-img.gz 2> /tmp/img_out_err.1fS5Ck
Partclone v0.3.23 XXXXXX
Starting to clone device (/dev/nvme0n1p1) to image (-)
Reading Super Block
memory needed: 21102596 bytes
bitmap 131072 bytes, blocks 2*10485760 bytes, checksum 4 bytes
Calculating bitmap... Please wait... 
done!
File system:  FAT32
Device size:  536.9 MB = 1048576 Blocks
Space in use:   1.2 MB = 2384 Blocks
Free Space:   535.7 MB = 1046192 Blocks
Block size:   512 Byte
Total block 1048576
Syncing... OK!
Partclone successfully cloned the device (/dev/nvme0n1p1) to the image (-)
>>> Time elapsed: 7.20 secs (~ .120 mins)
*****************************************************.
Finished saving /dev/nvme0n1p1 as /home/partimag/2023-05-22-22-img/nvme0n1p1.vfat-ptcl-img.gz
*****************************************************.
Starting saving /dev/nvme0n1p2 as /home/partimag/2023-05-22-22-img/nvme0n1p2.XXX...
/dev/nvme0n1p2 filesystem: ext4.
*****************************************************.
*****************************************************.
Use partclone with pigz to save the image.
Image file will not be split.
*****************************************************.
If this action fails or hangs, check:
* Is the disk full ?
*****************************************************.
Running: partclone.ext4 -z 10485760 -N -L /var/log/partclone.log -c -s /dev/nvme0n1p2 --output - | pigz -c --fast -b 1024 --rsyncable > /home/partimag/2023-05-22-22-img/nvme0n1p2.ext4-ptcl-img.gz 2> /tmp/img_out_err.mq0Qzr
Partclone v0.3.23 XXXXXXXX
Starting to clone device (/dev/nvme0n1p2) to image (-)
Reading Super Block
memory needed: 21102596 bytes
bitmap 131072 bytes, blocks 2*10485760 bytes, checksum 4 bytes
Calculating bitmap... Please wait... 
done!
File system:  EXTFS
Device size:    1.1 GB = 1048576 Blocks
Space in use: 233.7 MB = 228201 Blocks
Free Space:   840.1 MB = 820375 Blocks
Block size:   1024 Byte
Total block 1048576
Syncing... OK!
Partclone successfully cloned the device (/dev/nvme0n1p2) to the image (-)
>>> Time elapsed: 7.17 secs (~ .119 mins)
*****************************************************.
Finished saving /dev/nvme0n1p2 as /home/partimag/2023-05-22-22-img/nvme0n1p2.ext4-ptcl-img.gz
*****************************************************.
Starting saving /dev/mapper/ocs_luks_9X5 as /home/partimag/2023-05-22-22-img/ocs_luks_9X5.XXX...
/dev/mapper/ocs_luks_9X5 filesystem: btrfs.
*****************************************************.
*****************************************************.
Use partclone with pigz to save the image.
Image file will not be split.
*****************************************************.
If this action fails or hangs, check:
* Is the disk full ?
*****************************************************.
Running: partclone.btrfs -z 10485760 -N -L /var/log/partclone.log -c -s /dev/mapper/ocs_luks_9X5 --output - | pigz -c --fast -b 1024 --rsyncable > /home/partimag/2023-05-22-22-img/ocs_luks_9X5.btrfs-ptcl-img.gz 2> /tmp/img_out_err.n1aNcr
Partclone v0.3.23 XXXXXX
Starting to clone device (/dev/mapper/ocs_luks_9X5) to image (-)
Reading Super Block

btrfsclone.c: btrfs library version = Btrfs v6.0.1 
block_size = 16384
usedblock = 10521443
device_size = 549739036672
totalblock = 33553408
btrfsclone.c: fs_close
memory needed: 25165700 bytes
bitmap 4194176 bytes, blocks 2*10485760 bytes, checksum 4 bytes
Calculating bitmap... Please wait... 

btrfsclone.c: btrfs library version = Btrfs v6.0.1 
done!
File system:  BTRFS
Device size:  549.7 GB = 33553408 Blocks
Space in use: 172.4 GB = 10521443 Blocks
Free Space:   377.4 GB = 23031965 Blocks
Block size:   16384 Byte
Total block 33553408
Syncing... OK!
Partclone successfully cloned the device (/dev/mapper/ocs_luks_9X5) to the image (-)
>>> Time elapsed: 2609.89 secs (~ 43.498 mins)
*****************************************************.
Finished saving /dev/mapper/ocs_luks_9X5 as /home/partimag/2023-05-22-22-img/ocs_luks_9X5.btrfs-ptcl-img.gz
*****************************************************.
Dumping the device mapper table in /home/partimag/2023-05-22-22-img/dmraid.table...
Saving block devices info in /home/partimag/2023-05-22-22-img/blkdev.list...
Saving block devices attributes in /home/partimag/2023-05-22-22-img/blkid.list...
Checking the integrity of partition table in the disk /dev/nvme0n1... 
Reading the partition table for /dev/nvme0n1...RETVAL=0
*****************************************************.
Saving the primary GPT of nvme0n1 as /home/partimag/2023-05-22-22-img/nvme0n1-gpt-1st by dd...
34+0 records in
34+0 records out
17408 bytes (17 kB, 17 KiB) copied, 0.00330889 s, 5.3 MB/s
*****************************************************.
Saving the secondary GPT of nvme0n1 as /home/partimag/2023-05-22-22-img/nvme0n1-gpt-2nd by dd...
32+0 records in
32+0 records out
16384 bytes (16 kB, 16 KiB) copied, 0.0030791 s, 5.3 MB/s
*****************************************************.
Saving the GPT of nvme0n1 as /home/partimag/2023-05-22-22-img/nvme0n1-gpt.gdisk by gdisk...
The operation has completed successfully.
*****************************************************.
Saving the MBR data for nvme0n1...
1+0 records in
1+0 records out
512 bytes copied, 0.00200132 s, 256 kB/s
End of saveparts job for image /home/partimag/2023-05-22-22-img.
Checking if udevd rules have to be restored...
This program is not started by Clonezilla server, so skip notifying it the job is done.
This program is not started by Clonezilla server, so skip notifying it the job is done.
Finished!
Finished!
### End of log ###
### Image created time: 2023-0522-2255
Image was saved by these Clonezilla-related packages:
 drbl-5.2.10-drbl1 clonezilla-5.4.6-drbl1 partclone-0.3.23-drbl-1 util-linux-2.38.1-5+b1 gdisk-1.0.9-2.1
Saved by clonezilla-live-3.1.0-22-amd64.

I tried to mount the image itself (after unzipping it), but I just get

wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

It seems to have its own format though; the first few bytes read as follows:

00000000: 7061 7274 636c 6f6e 652d 696d 6167 6500  partclone-image.
00000010: 302e 332e 3233 0000 0000 0000 0000 3030  0.3.23........00
00000020: 3032 dec0 4254 5246 5300 0000 0000 0000  02..BTRFS.......
00000030: 0000 0000 0000 00ff 7f00 0000 00fc ff01  ................
00000040: 0000 0000 638b a000 0000 0000 a04e 3301  ....c........N3.
00000050: 0000 0000 0040 0000 1200 0000 0200 4000  .....@........@.
00000060: 2000 0400 8002 0000 0101 3860 283e 1f00   .........8`(>..
00000070: 0000 0000 0000 ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................

What is the output of partclone.info, or partclone.chkimg?

Looking at the manpage, it seems partclone has an option

       --restore_raw_file
           Creating special raw file for loop device.

so I guess you need to make a raw image file out of the partclone image first to be able to mount it.
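
Something along these lines might work (untested; the long option name is from the manpage, and the exact flag combination may differ):

# decompress the pigz-compressed Clonezilla image first
zcat ocs_luks_9X5.btrfs-ptcl-img.gz > ocs_luks_9X5.btrfs-ptcl-img
# turn the partclone image into a raw, loop-mountable file
partclone.btrfs --restore --restore_raw_file -s ocs_luks_9X5.btrfs-ptcl-img -o ocs_luks_9X5.raw
# then try a read-only loop mount
mount -o ro,loop ocs_luks_9X5.raw /mnt/test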

Looks like the backup log didn’t report any errors; how about the restore log?

Clonezilla does several things automatically, like resizing file systems to fit the partitions after restoring, rebuilding the initrd, and more. You might want to try running the commands yourself, sticking to only the safe options and skipping some of the steps Clonezilla adds.
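
A rough sketch of such a manual restore, derived from the save command in your log (the target partition inside the VM is only a placeholder, and this assumes the image has already been decompressed as in the loop-mount sketch above):

# open the target LUKS container and let partclone write the image back
cryptsetup open /dev/vda3 restored_luks
partclone.btrfs -r -s ocs_luks_9X5.btrfs-ptcl-img -o /dev/mapper/restored_luks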

I don’t know if it’ll help, but another thing to try is Rescuezilla:

“It is carefully developed to be fully interoperable with the industry-standard Clonezilla”
“Extract files from your backup images (beta)”
“Recover Lost Data”

There’s also the btrfs restore command, which attempts to recover files from a corrupt btrfs partition.
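
A minimal sketch, assuming /dev/mapper/luks is the opened LUKS device and /mnt/recovered is an empty directory on another disk; <bytenr> stands for a tree root location reported by btrfs-find-root:

# copy out whatever is still readable; the source device is only read, never written
mkdir -p /mnt/recovered
btrfs restore -v /dev/mapper/luks /mnt/recovered
# if that fails, older tree roots can be tried explicitly
btrfs restore -t <bytenr> -v /dev/mapper/luks /mnt/recovered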

partclone.info:

Partclone v0.3.23 http://partclone.org
Showing info of image (/mnt/ocs_luks_9X5.btrfs-ptcl-img)
File system:  BTRFS
Device size:  549.7 GB = 33553408 Blocks
Space in use: 172.4 GB = 10521443 Blocks
Free Space:   377.4 GB = 23031965 Blocks
Block size:   16384 Byte

image format:    0002
created on a:    64 bits platform
with partclone:  v0.3.23
bitmap mode:     BIT
checksum algo:   CRC32
checksum size:   4
blocks/checksum: 640
reseed checksum: yes

I will add the partclone.chkimg output later and try --restore_raw_file.
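
For the check, something like this should verify the image’s internal checksums (same path as the partclone.info run above):

partclone.chkimg -s /mnt/ocs_luks_9X5.btrfs-ptcl-img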

I don’t have the restore log anymore since the restore was done from a live CD, but as far as I remember there were no errors either.

Rescuezilla does not seem to support LUKS and btrfs.

And the btrfs restore command results in the same error :confused:

IME, once a btrfs filesystem is compromised, it’s toast. If you’re lucky (and I have been), it still mounts read-only and you can copy your data off.
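
An example of such a read-only attempt (the rescue= mount options need a reasonably recent kernel; on older kernels they are separate options named usebackuproot and nologreplay):

mount -o ro,rescue=usebackuproot,rescue=nologreplay /dev/mapper/luks /mnt/rescue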

I suggest restoring the backup to the same hardware it came from, with no changes in partition size. I have little experience here, but I suspect that restoring into a VM causes problems, maybe in combination with Clonezilla.