Data recovery, photorec problem and cloning

Long story short, I rm’ed some files and a dir that I want back.

I have tried ext4magic; it gets quite a few of the files, just not all of them.
photorec looks like it is working better, but the drive always disconnects when it gets to recup_dir.10.

Also, is it possible to clone just the free space instead of the whole disk, so I can work on recovering the data later? I would like to get this computer back to work.

Thanks in advance

the ‘dd’ command can read an entire disk or partition and output it to a file that you can loopback-mount, then run photorec/testdisk on that.

dd can create an, uh, mountain of problems so be careful. it’s otherwise known as ‘disk destroyer’ and i’ve definitely done just that; there is no guard rail.

from a live image, where sda is your disk of interest (find it with lsblk), or sda1 is your partition of interest:
dd if=/dev/sda of=myfile bs=1M

(‘pv’ is a less standard but nicer utility; let’s not complicate things for one task)
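if you want to get comfortable with the syntax first, the imaging step can be dry-run on a regular file — everything below is a made-up stand-in, no real devices involved:

```shell
# stand-in for imaging a device: dd copies the source byte-for-byte into an
# image file. on the real system the input would be /dev/sda (needs root).
head -c 1048576 /dev/urandom > fake_partition
dd if=fake_partition of=myfile bs=64K status=none
cmp fake_partition myfile && echo "image matches source"
```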

if you have read errors with dd (one interpretation of the drive disconnects is that you are), also try gddrescue.

once you have the image you can
losetup -f myfile

and the loop device should be findable with photorec/testdisk.

test that before you do anything destructive (including writing any file) to the original partition. good luck.


You saying use dd scares me, even though I’ve never made a mistake using it I still fear it every time.

Looking into it and watching Wendell’s old data recovery video, I think I know what I should do, but I just want to double-check.

Have the drive that had the files on it connected but unmounted

ddrescue -d -r3 /dev/mapper/VolGroup-home /mnt/Clone.img CloneLog.txt

/dev/mapper/VolGroup-home being the partition that had the files
/mnt being a mounted external hard drive that has a partition large enough.

Reading the man page, should I use the -f or -n flags?
Is there any advantage to cloning the whole drive over just this one partition?

You might want to elaborate on what type of disk this is. If it is an SSD, the TRIM feature may have already discarded the pages where your files were located. If so, there is absolutely no chance to get them back.

-f will have effect only when you try to WRITE to a DRIVE or a PARTITION, when WRITING to FILES, it does nothing.

-n will skip scraping, the phase in which ddrescue tries to recover the hardest blocks by reading them again and again. If those blocks are unrecoverable, ddrescue will waste a lot of time achieving nothing.

I would only skip scraping if the drive is so damaged that parts of it are unrecoverable.

EDIT: A personal preference is never to use “dd” for data recovery. If using “dd” to clone healthy partitions, you should use the “iflag=fullblock” argument.

It is a healthy SSD.
I don’t mind if I don’t get back all of the files; I just want as many as possible.

Data recovery of deleted content from an SSD is usually not possible, since the drive’s internal firmware will eventually erase freed flash memory by resetting every cell to 1. Because of this, a successful recovery of your accidentally deleted files would require a healthy backup.

As an addition to my prior post: scraping from an SSD is not possible; it only works on an HDD. When a sector on an HDD becomes unstable, the drive may fail to read from it intermittently rather than permanently, so trying to read a failing sector multiple times makes sense since it may eventually succeed. Doing the same thing with an SSD makes no sense, since a flash cell is (simplified) either good or bad.

ddrescue is the better option.

it has an option to specify a log file that saves its state in the event the system shuts down, or if you happen to hit a bad area of the disk that causes the controller to stall.

may wanna check whether your mobo supports sata hotplugging and make sure it’s enabled, or if you’ve got a usb enclosure or dock that is an option as well.

i don’t wanna bother looking up the syntax but i’ve done it enough to have muscle memory for it; checking the manpages wouldn’t hurt, it may have useful options for your use case.

sudo ddrescue -P -f /dev/sd(x) /path/to/image/or/output/disk recovery.log

-P shows a data preview so you know if it’s stalled, or has gotten close to the end of the disk and is just copying zeros, if you wanna cut it off short. if i remember correctly, -f forces writing to the destination; it’s required when the output is a drive or partition rather than a regular file.

if the drive stalls, disconnect its power and wait ~10 seconds; ddrescue will exit due to the input disappearing.

then replug the drive, wait a few seconds for the device node to pop back up, then re-run the ddrescue command

if you have multiple drives in your system that aren’t always mounted, or if you happen to plug in a flashdrive or something while it’s running, you might wanna check your device node (sda, sdb, sdc) after hotplugging to make sure the drive came back with the same node before hitting enter. if not, update the command to whatever the new node is (previously sdd, now it’s sdf). gnome-disk-utility is an easy gui to check with (it lists drive model/capacity/serial and the node /dev/sd(x)).

repeat until ddrescue has rescued 99.99% of the drive, or until the drive won’t wake up after hotplugging or just starts clicking (if it’s a platter drive).

Then run testdisk/photorec against the cloned image or output disk.

ssds don’t actually self-scrub as often as you’d think, due to having a finite number of writes. you can still recover stuff from SSDs, but deleting and then overwriting is a lot more damaging on an SSD because it can’t erase a partial block, only an entire block, so smaller files will get steamrolled quickly.

trimming the drive essentially just updates the firmware’s internal “map” of allocated blocks to match the filesystem’s allocation records (the MFT on NTFS).

the fact that the drive disconnects when it gets to recup_dir.10 means either the drive is having an issue, or the drive is fine and you’re running out of space in your recovery directory. also, you should be saving the recovered data to a physically different drive (or partition) than the one you’re recovering from; if not, you’re just overwriting the stuff you’re hoping to recover.

cloning only free space “is” possible, but in most practical cases isn’t. disks don’t write from start to finish, especially SSDs.

say for example you’re writing a 500MB file: the drive is going to lay it down in the first spot where there is 500MB of free space.
because files are created and deleted at different times (temp files/etc), you end up with fragmentation.

on ssds it gets more complicated because the firmware on the drive tries to “wear” the flash cells evenly, so it does sort of try to start at the beginning, go to the end, and wrap around. but again, files are created and deleted in different orders, so it’s not guaranteed that the free space lies after the last file written to disk.

if the drive were defragmented offline and compacted or made contiguous beforehand, you totally could clone just the free space with regular dd. there are tools for vms/multiboot/odd use cases that can pack a filesystem down so everything is laid out with no gaps and just a big block of free space at the end of the drive… but running a tool like that after an accidental deletion/filesystem corruption/“bad event” will almost guarantee you aren’t going to recover anything useful.

on that thought though: photorec, if it’s dealing with a filesystem it knows well enough, will offer an option to recover files only from the free space on the drive.

testdisk/photorec support a lot of “optional” stuff depending on what development libraries are installed at build time. IIRC most debian/ubuntu builds of testdisk/photorec don’t support hfs all that well - install the dev packages and then build testdisk from source - may wanna check the package recommends for hints or the testdisk wiki.

Thanks for the lengthy reply.

I have started cloning with,
sudo ddrescue -d -n -r3 /dev/sd(x) /path/to/image.img /path/to/image.log
I did try running without -d, but it disconnected the SSD I’m trying to clone; with -d it seems to be working fine.

I’ll report back once the clone is done.

yw, hope all goes well.

because of that whole “everything is a file” principle in linux/unix, you can totally run fdisk and fsck on the image.
you can’t specify a partition (like sda3) because you’re working with a file on disk; use the offset the partition starts at. use fdisk to get that, then loopback with the offset.
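as a sketch of the offset math (the sector number here is an invented example; read the real one from fdisk -l on your image):

```shell
# fdisk -l image.img reports each partition's start sector; the byte offset
# for loop-mounting is the start sector times the sector size (usually 512).
START_SECTOR=2048   # example value, taken from the Start column of fdisk -l
SECTOR_SIZE=512
OFFSET=$((START_SECTOR * SECTOR_SIZE))
echo "sudo mount -o ro,loop,offset=$OFFSET image.img /mnt/img"
```

mounting read-only (ro) keeps the image itself from being modified while you look around.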

if fsck can repair the filesystem, it’ll make it easier for testdisk to recover your data with convenient things like folder structure and file names, instead of the spamheap of files named in the order they were found that photorec will give you to sift through.
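fsck really does take a plain file; here a tiny fresh ext4 image stands in for your clone (on the real clone, run with -n first so nothing gets written):

```shell
# make a small ext4 image file and fsck it directly, no block device needed.
truncate -s 8M fs.img
mke2fs -q -F -t ext4 fs.img   # -F: operate on a regular file
e2fsck -fn fs.img             # -f: force check, -n: read-only, answer "no" to fixes
```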

there’s a python script on the testdisk wiki (for after running photorec) that’ll sort the recovered files by filetype. it doesn’t do a mime/magic-number check, just the three-letter extension, so occasionally you’ll wind up with a folder full of .zip files that are actually .docx, and the reverse. the file command actually looks at the magic numbers and can sometimes differentiate edge cases like this.
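a minimal stand-in for that sorting step, with invented file names (the real wiki script does the same thing, just in python):

```shell
# sort recovered files into per-extension folders; trusts the extension only,
# no magic-number check, same limitation as described above.
mkdir -p recup_dir.1 sorted
touch recup_dir.1/f0001.jpg recup_dir.1/f0002.pdf recup_dir.1/f0003.jpg
for f in recup_dir.1/*; do
  ext="${f##*.}"              # everything after the last dot
  mkdir -p "sorted/$ext"
  mv "$f" "sorted/$ext/"
done
ls sorted                     # now contains jpg/ and pdf/
```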

If you don’t have any important PDFs you’d like to recover, tell photorec not to try to recover those; they’re bad for false positives. PDFs seem to be weird frankenfiles that can have all sorts of other crap embedded in them, which causes photorec to “recover” files it thinks are pdfs that are multiple gigabytes in size and always corrupt.

That being said, if you’re in bad need of your files, photorec will deliver when almost everything else fails.

The ddrescue clone finished.

I was wondering if there is a way to double-check that it is a full clone?
I know it should be, but I want to put this drive back in and start using it again without messing everything up.

I did mount the clone to check, and had the idea of running a diff to make sure all the files that were on the encrypted home partition are on the clone, but I think that would just be to comfort me.

Since you used the “-d” option, there is very little chance that ddrescue would have failed to report a bad read, so the image can be considered a perfect clone, unless the hardware is borked, in which case verification would not make sense anyway.

You could, however, compute a checksum of both the drive and the image file and compare the results; sha256sum -b would be a good choice.
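Demonstrated on regular files (on the real system the first argument would be the source device, e.g. /dev/mapper/VolGroup-home, and the checksums only match if the image covers the entire device with no read errors):

```shell
# clone a source, then verify: identical checksums mean a byte-for-byte copy.
head -c 524288 /dev/urandom > source.bin
dd if=source.bin of=clone.img bs=64K status=none
sha256sum -b source.bin clone.img       # the two hashes should be identical
cmp -s source.bin clone.img && echo "clone verified"
```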