I'm in over my head and need help: BTRFS file recovery

background

Hi all,
I was experimenting with Waydroid. I accidentally deleted (rm -rf) my Documents, Music, and Pictures folders, which were linked into the Waydroid directories. In my panic, I then ran TestDisk and incorrectly wrote the partition table type as MBR.

I tried using ChatGPT, but I feel like I'm going in circles. I found this tool that should be able to recover it, but I am having trouble with the outputs:
https://www.lukeshu.com/blog/btrfs-rec.html

btrfs-rec inspect rebuild-mappings \
    --pv=/run/media/bedhedd/Expansion/disk_clone/backup.img \
    > mappings-1.json
12:38:23.0537 ERR : goroutine "/main" exited with error: device file "/run/media/bedhedd/Expansion/disk_clone/backup.img": superblock 0: unknown checksum type: 22
12:38:23.0539 INF thread=:shutdown_logger : shutting down (gracefully)...

12:38:23.0539 INF thread=:shutdown_status :   final goroutine statuses:
12:38:23.0539 INF thread=:shutdown_status :     /main: exited with error
btrfs-rec: error: device file "/run/media/bedhedd/Expansion/disk_clone/backup.img": superblock 0: unknown checksum type: 22
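From what I can tell, csum_type 22 isn't a real checksum type (btrfs only defines 0–3: crc32c, xxhash, sha256, blake2), so the tool is probably not reading a real superblock at all: backup.img is a whole-disk image, and the first btrfs superblock lives 64 KiB into the *partition*, not 64 KiB into the disk. A rough sketch of the offsets, assuming the common partition start at sector 2048:

```shell
# Sketch (assumption: the btrfs partition starts at the common sector 2048).
# In a whole-disk image the first superblock is NOT at byte 65536 of the file;
# it is at partition_start + 65536, so a tool pointed at the raw disk image
# reads unrelated bytes and reports a bogus csum_type.
part_start=$(( 2048 * 512 ))         # byte offset of the partition in the image
sb_offset=$(( part_start + 65536 ))  # byte offset of superblock copy 0
echo "$sb_offset"                    # 1114112
```

If that's the issue, attaching the image with losetup -P and pointing --pv at the partition device (e.g. /dev/loop0p1) might get past this error.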

current status

I created a dd backup image of the disk

debugging commands

sda1 is the NVMe drive

sudo btrfs check --readonly /dev/sda1
Opening filesystem to check...
Checking filesystem on /dev/sda1
UUID: 027f2550-4813-20d9-ac54-fc87dc4612eb
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
ERROR: block device size is smaller than total_bytes in device item, has 2199023255040 expect >= 4096804589568
ERROR: errors found in extent allocation tree or chunk allocation
[4/8] checking free space cache
cache and super generation don't match, space cache will be invalidated
[5/8] checking fs roots
[6/8] checking only csums items (without verifying data)
[7/8] checking root refs
[8/8] checking quota groups skipped (not enabled on this FS)
found 924773294080 bytes used, error(s) found
total csum bytes: 901903928
total tree bytes: 1090568192
total fs tree bytes: 44154880
total extent tree bytes: 17858560
btree space waste bytes: 121016944
file data blocks allocated: 1541724680192
 referenced 1541646524416
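Worth noting: that "has 2199023255040" number is suspiciously exact. It's (2^32 − 1) × 512 bytes, i.e. the largest partition a 32-bit LBA sector count in an MBR entry can describe, which lines up with TestDisk having written an MBR table over what was a ~4 TB GPT disk:

```shell
# btrfs check says the block device is 2199023255040 bytes but the fs expects
# 4096804589568. The smaller figure is exactly (2^32 - 1) * 512: the MBR
# 2 TiB addressing limit clipping the partition. Quick arithmetic check:
echo $(( (2**32 - 1) * 512 ))   # 2199023255040
```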

I created a backup on a separate drive and mounted it

sudo losetup --find --show backup.img
sudo btrfs check --readonly /dev/loop0p1
Opening filesystem to check...
Checking filesystem on /dev/loop0p1
UUID: 027f2550-4813-20d9-ac54-fc87dc4612eb
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
[4/8] checking free space cache
[5/8] checking fs roots
[6/8] checking only csums items (without verifying data)
[7/8] checking root refs
[8/8] checking quota groups skipped (not enabled on this FS)
found 924660998144 bytes used, no error found
total csum bytes: 901903928
total tree bytes: 1089945600
total fs tree bytes: 44154880
total extent tree bytes: 17743872
btree space waste bytes: 120964529
file data blocks allocated: 1541613006848
 referenced 1541534851072

Fam, NGL, this sounds beyond repair if you messed with the partition table.

I don’t want to give the verdict as dead, because I know there are ways to recover old partition tables, and the data on the disk, if the rebuilt table matches exactly what was on the disk previously. But I don’t know the intricate details of how to do that.

I’d make another copy of that img file, rename it to “btrfs.raw” (assuming it’s the whole disk, not just the partition, that you dd copied) and create a VM around it. Boot the VM with a live rescue USB of your preferred Linux distro and start from there.

With the vdisk showing up as an actual device (i.e. /dev/sda or /dev/vda depending on the driver you use, SCSI or virtio), you might be able to perform more operations on the image. If you break it, make a new copy of the image.

2 Likes

sounds like it’s joever, but…

I did some further tests with chatgpt, it suggested

sudo mount -o ro,subvol=image /dev/loop0p1 /mnt/old-root

then

ls -lah /mnt/old-root
total 839G
drwxr-xr-x. 1 USER USER   16 Dec 31  1969 ./
drwxr-xr-x. 8 root    root    4.0K Jan 31 13:33 ../
-rw-------. 1 USER USER 932G Dec 31  1969 ntfs.img

This December 31st (i.e. the Unix epoch) might be a good sign

file /mnt/old-root/ntfs.img
/mnt/old-root/ntfs.img: DOS/MBR boot sector, code offset 0x52+2, OEM-ID "NTFS    ", sectors/cluster 8, Media descriptor 0xf8, sectors/track 63, heads 255, hidden sectors 2048, dos < 4.0 BootSector (0x80), FAT (1Y bit by descriptor); NTFS, sectors/track 63, sectors 1953519615, $MFT start cluster 786432, $MFTMirror start cluster 2, bytes/RecordSegment 2^(-1*246), clusters/index block 1, serial number 0a04673e04672c3f
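As a sanity check (my own arithmetic, not from a tool): the sector count in that boot sector should agree with the ~932G that ls reported for ntfs.img:

```shell
# Cross-check: the NTFS boot sector reports 1953519615 sectors of 512 bytes,
# which should line up with the size ls showed for ntfs.img.
bytes=$(( 1953519615 * 512 ))
echo "$bytes"                            # 1000202042880
echo $(( bytes / 1024 / 1024 / 1024 ))   # 931 (GiB, floored; ls rounds to 932G)
```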

then it suggested

sudo mkdir -p /mnt/ntfs-mount
sudo mount -o loop,ro /mnt/old-root/ntfs.img /mnt/ntfs-mount

I’m copying the files right now; it is currently on the music directory.

maybe we’re back. The NTFS image looks like it is from August. Not ideal, but better than trying to rebuild from a PhotoRec dump.

2 Likes

Is this the right command for the complete disk?

sudo dd if=/dev/sda of="/run/media/bedhedd/Expansion/disk_clone/backup.img" bs=64K status=progress

Yes, that would do it. That assumes that /dev/sda doesn’t have any partitions mounted anywhere.

I would probably increase the block size to bs=4M, but that’s just me. The block size mostly affects read/write performance, nothing else.

oh I already completed that. Do I need to rerun it with the bs=4M?

Also, it looks like the NTFS image is from August 2024. Is there a way to use it as a reference to recover further?

No, it’s just performance stuff, it might take longer, but it’ll go through.
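If you want to convince yourself without touching any disks, here's a scratch-file check (temp files only, safe to run anywhere) that bs= doesn't change the bytes that get copied:

```shell
# Quick local check: dd's bs= changes how data is chunked during the copy,
# not the bytes that come out the other end.
src=$(mktemp); a=$(mktemp); b=$(mktemp)
head -c 1048576 /dev/urandom > "$src"      # 1 MiB of random data
dd if="$src" of="$a" bs=512 status=none    # tiny blocks: slower, same bytes
dd if="$src" of="$b" bs=64K status=none    # big blocks: faster, same bytes
cmp -s "$a" "$b" && echo "identical copies"
rm -f "$src" "$a" "$b"
```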

Not that I’m aware of. I’m pretty sure the disk needs to be scanned with some utility to detect where one file system begins and ends and where the next one begins. If the whole /dev/sda was formatted as btrfs (without partitions), then you’d be kinda screwed (because the fs data might have been overwritten when the partition table was created at the beginning of the disk). If it had partitions before, then only the partition table got overwritten, and a scan utility should be able to find that partition 1 started at block 2048 and ended at block n, partition 2 started at n+1 and ended at m, etc.

Once they’re all discovered, the partition table can be recreated and mounting the partitions can be attempted. As for btrfs recovery, you’d need another tool that looks at the fs level and finds data that’s marked as deleted but not yet shredded from the image. Again, I’m not entirely sure how, or what utility to use; I only know some basic theory on it (from reading what others did in their adventures of recovering data).
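For the btrfs case specifically, the scan idea can be sketched like this (my own demo on a scratch file, not a real recovery tool): the superblock magic "_BHRfS_M" sits 64 bytes into the superblock, and superblock copy 0 sits 64 KiB into the partition, so finding the magic in a raw disk image points back at a candidate partition start:

```shell
# Hedged sketch of "scan for where a filesystem begins" for btrfs:
#   candidate_partition_start = magic_offset - 64 - 65536
# Demo: plant the magic in a scratch image where a partition starting at
# sector 2048 would put it (1048576 + 65536 + 64 = 1114176), then find it.
img=$(mktemp)
truncate -s 4M "$img"                       # sparse scratch "disk"
printf '_BHRfS_M' | dd of="$img" bs=1 seek=1114176 conv=notrunc status=none
hit=$(grep -aboF '_BHRfS_M' "$img" | head -n1 | cut -d: -f1)
echo "partition candidate at byte $(( hit - 64 - 65536 ))"   # 1048576 = sector 2048
rm -f "$img"
```

On a real image there can be stray copies of the magic in file data, so every hit is only a candidate to verify with dump-super.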

1 Like

did some further debugging with ChatGPT
started by loading in the fixed-GPT version of the backup image

sudo losetup -Pf --show backup.img

then the untouched clone of the disk

sudo losetup -Pf --show backup_copy.img

it then suggested sudo btrfs inspect-internal dump-super /dev/loop1p1, because I gave it the constraint that I cannot mount backup_copy without freezing/crashing my system. Just to double-check, I did the same for the fixed-GPT version

sudo btrfs inspect-internal dump-super /dev/loop1p1
superblock: bytenr=65536, device=/dev/loop1p1
---------------------------------------------------------
csum_type		0 (crc32c)
csum_size		4
csum			0x0a6524c4 [match]
bytenr			65536
flags			0x1
			( WRITTEN )
magic			_BHRfS_M [match]
fsid			027f2550-4813-20d9-ac54-fc87dc4612eb
metadata_uuid		027f2550-4813-20d9-ac54-fc87dc4612eb
label			MassStorage
generation		9995
root			2050322595840
sys_array_size		97
chunk_root_generation	5834
root_level		1
chunk_root		272467312640
chunk_root_level	1
log_root		0
log_root_transid (deprecated)	0
log_root_level		0
total_bytes		4096804589568
bytes_used		924773294080
sectorsize		4096
nodesize		16384
leafsize (deprecated)	16384
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x371
			( MIXED_BACKREF |
			  COMPRESS_ZSTD |
			  BIG_METADATA |
			  EXTENDED_IREF |
			  SKINNY_METADATA |
			  NO_HOLES )
cache_generation	5841
uuid_tree_generation	5841
dev_item.uuid		35568c84-3602-0f27-1ecb-befaf5fc5d35
dev_item.fsid		027f2550-4813-20d9-ac54-fc87dc4612eb [match]
dev_item.type		0
dev_item.total_bytes	4096804589568
dev_item.bytes_used	996745871360
dev_item.io_align	4096
dev_item.io_width	4096
dev_item.sector_size	4096
dev_item.devid		1
dev_item.dev_group	0
dev_item.seek_speed	0
dev_item.bandwidth	0
dev_item.generation	0
sudo btrfs inspect-internal dump-super /dev/loop0p1
superblock: bytenr=65536, device=/dev/loop0p1
---------------------------------------------------------
csum_type		0 (crc32c)
csum_size		4
csum			0xf363205e [match]
bytenr			65536
flags			0x1
			( WRITTEN )
magic			_BHRfS_M [match]
fsid			027f2550-4813-20d9-ac54-fc87dc4612eb
metadata_uuid		027f2550-4813-20d9-ac54-fc87dc4612eb
label			MassStorage
generation		10002
root			219058307072
sys_array_size		97
chunk_root_generation	10000
root_level		1
chunk_root		272467378176
chunk_root_level	1
log_root		0
log_root_transid (deprecated)	0
log_root_level		0
total_bytes		4096804589568
bytes_used		924661456896
sectorsize		4096
nodesize		16384
leafsize (deprecated)	16384
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x371
			( MIXED_BACKREF |
			  COMPRESS_ZSTD |
			  BIG_METADATA |
			  EXTENDED_IREF |
			  SKINNY_METADATA |
			  NO_HOLES )
cache_generation	10002
uuid_tree_generation	10002
dev_item.uuid		35568c84-3602-0f27-1ecb-befaf5fc5d35
dev_item.fsid		027f2550-4813-20d9-ac54-fc87dc4612eb [match]
dev_item.type		0
dev_item.total_bytes	4096804589568
dev_item.bytes_used	996745871360
dev_item.io_align	4096
dev_item.io_width	4096
dev_item.sector_size	4096
dev_item.devid		1
dev_item.dev_group	0
dev_item.seek_speed	0
dev_item.bandwidth	0
dev_item.generation	0

it summarized the logs as a table

Field                   loop1p1 (backup_copy.img)   loop0p1 (backup.img, GPT version)
Generation              9995                        10002
Root Tree               2050322595840               219058307072
Chunk Root Generation   5834                        10000
Cache Generation        5841                        10002
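For what it's worth, "generation" is btrfs's transaction counter, so the higher number is the more recently committed superblock. A tiny sketch for pulling the values out of saved dump-super output (file names and contents here are stand-ins; redirect the real command's output into them first):

```shell
# Stand-in files mimicking the "generation" line of dump-super output, e.g.:
#   sudo btrfs inspect-internal dump-super /dev/loop1p1 > /tmp/super-old.txt
printf 'generation\t\t9995\n'  > /tmp/super-old.txt
printf 'generation\t\t10002\n' > /tmp/super-new.txt
old=$(awk '$1 == "generation" { print $2 }' /tmp/super-old.txt)
new=$(awk '$1 == "generation" { print $2 }' /tmp/super-new.txt)
[ "$new" -gt "$old" ] && echo "second superblock is newer ($new > $old)"
```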

then it suggested

sudo btrfs restore -l /dev/loop1p1
 tree key (EXTENT_TREE ROOT_ITEM 0) 2050330624000 level 2
 tree key (DEV_TREE ROOT_ITEM 0) 2051375775744 level 1
 tree key (FS_TREE ROOT_ITEM 0) 2051378561024 level 2
 tree key (CSUM_TREE ROOT_ITEM 0) 2051378888704 level 2
 tree key (UUID_TREE ROOT_ITEM 0) 272381034496 level 0
 tree key (256 ROOT_ITEM 0) 272460021760 level 2
 tree key (DATA_RELOC_TREE ROOT_ITEM 0) 219166457856 level 0

After that it suggested I dump the metadata into a mnt folder. I didn’t want to do that, so I had it dump to a folder within my working directory

mkdir -p ./recovered_meta
sudo btrfs restore --metadata /dev/loop1p1 ./recovered_meta/

It seems to be an NTFS image. I wanted a real-time monitor, so it suggested

watch -n 2 ls -lah ./recovered_meta/

I’ll have to wait for it to finish, but it isn’t looking like I can get a more recent version of the files

took a look at it. Unfortunately it did not recover the drive; all it had was the NTFS img. Not ideal, but better than trying to rebuild my folder structure from the PhotoRec dump

hell yeah

ope…

noice recovery, but… don’t do that shit

use ddrescue when handling recoveries…

You need to ddrescue the input drive (sda) to another drive and work from that image. NEVER, under any circumstances, work directly on the drive to be recovered.

$apt install -y gddrescue (Debian/Ubuntu; the binary is still ddrescue) or $dnf install -y ddrescue (Fedora)
$ddrescue -d -D -f /dev/sd(input) /dev/sd(output) rescue.map

you will overwrite the output drive in its entirety, so don’t fuck this step up

Then run an analysis on the drive from a live CD to try and unfuck the partition table.

sudo btrfs restore -v /dev/sdX /path/to/recovery/directory should suffice, using the appropriate /dev/sdX,
and we will see what it can pull back…

1 Like

dd is fine, as @ThatGuyB mentioned (and it’s what @bedHedd did); ddrescue/dd_rescue are only relevant when you have unreadable sectors (hardware issues), and that doesn’t seem to be the case here. There’s also no need to copy it to another block device; a file is a better option and is “portable”.

2 Likes

if he was doing a recovery experiment, your suggestions would be fantastic.

But this is apparently mission-critical data, so he should not pilot a new method in prod

ddrescue and not dd, because he cannot verify that there are no bad sectors until each block is actually read.

It’s a Schrödinger-style duality problem: drive degradation is not apparent until the drive is accessed.

Now, should he be hitting the entire target drive with an additional read when he is handling a recovery?

1 Like

Just used dd to clone the (unmounted) sda1 drive to an external drive. All commands were run on the cloned image.

I couldn’t find any previous versions of the btrfs metadata

1 Like


fawk

i had a bash script to scrape for old metadata on btrfs, but I’m pretty sure a tech nuked that drive when he accidentally ddrescue’d to the wrong drive.

Too busy this week to attempt recovery and our tools are typically built on the fly for odd ball file systems.

You’re lookin for a header file starting with… actually, I copied and modded a gith8 script, let me…

there it is:

1 Like

so if I understand correctly, I just run the script in the same folder as my backup imgs? What do I put in place of the meta file ($file)?
/dev/loop0 '$file?' /btrfs-meta

I’m not sure what you’re trying to justify; dd will list read errors, and an image has already been created, so it’s a non-issue in this case.

must not be mounted, otherwise this program may appear to work but find nothing.

That’s an odd quirk of the script for recoveries: gotta umount first, then you can work with the drive.

There’s a few other scripts like this.

sauce: the gitbutt page

Gotcha, the img is not mounted, just attached via losetup

what is the name of the btrfs meta file I am looking for with the script?

@TryTwiceMedia bump. If you don’t know, I’ll probably just revert to my August backup