If a 7-Zip archive opens successfully and displays its contents, should that give confidence that the data inside is not corrupted? Same question for disk images - I only wonder because it must be in pretty good shape if it opens and displays the file/folder directories… or is that optimistic?
You probably should run the test function. IIRC I only get errors from a bad archive during extraction, unless the error/corruption is in the file list itself.
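For what it's worth, 7-Zip's command-line `t` (test) command does exactly that - it decompresses every stream and verifies the CRCs without writing files out. A minimal sketch (the archive path is hypothetical; assumes 7z.exe is on your PATH):

```powershell
# 't' = Test: decompresses everything and checks CRCs, writes nothing to disk.
7z t "C:\backups\archive.7z"
# Exit code 0 means all streams decompressed and passed their CRC checks.
if ($LASTEXITCODE -ne 0) { Write-Warning "Archive failed the integrity test." }
```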
For any archive format other than a TAR, there will usually be a header or footer of some sort to list the files. That means opening one will read only enough bytes in a specific area to list the contents. That tells you nothing about the integrity of the decompressed payload.
For disk images, it depends on the file system. A UDF-formatted ISO can have its file system metadata mirrored at two different locations to guard against disc scratches. Other file systems might be laid out more flexibly, mixing file data and metadata. In my experience, a disk can mount successfully, but you won't know about corruption until you try to list a specific directory or read a file.
I don't remember exactly how 7-Zip behaves, but WinRAR was able to show the contents of an archive despite some corrupted data; the problem usually only became visible when extracting the archive.
But even then, theoretically, we can't say we're 100% sure.
Checksums will always and everywhere be your friend. But they must be generated as soon as the resource is created.
Disk images in the sense of backups, or ISOs?
Same principle: use software that has integrity checks built in… and keep checksums regardless.
Having checksums will not only help us assess whether specific data is 100% intact, it can also be used to detect whether someone has modified any files on the disk.
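Something as simple as this works - a minimal PowerShell sketch (the folder name is hypothetical) that records SHA-256 sums for everything under a folder right after the data is created:

```powershell
# Record a SHA-256 sum for every file under $root, in the common
# "<hash> *<relative path>" manifest format, stored alongside the data.
$root = "D:\archive"
Get-ChildItem -Path $root -Recurse -File | ForEach-Object {
    $hash = (Get-FileHash -Path $_.FullName -Algorithm SHA256).Hash
    "{0} *{1}" -f $hash, $_.FullName.Substring($root.Length + 1)
} | Set-Content -Path (Join-Path $root "checksums.sha256")
```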
Nope. It just means the metadata and the file table of contents are uncorrupted.
You will have to try to extract the archive to know if it's actually corrupt…
That being said, some archivers will open the file list but fail to extract due to an unsupported compression method.
So if 7-Zip fails, try extracting with something like WinRAR, which supports more compression methods.
I don't know if anyone has mentioned it yet, but the files could have been corrupted prior to compression. So a successful decompression could just mean you're unpacking a file that was already corrupted.
Ah, it’s a shame I didn’t know this before…lesson learnt!
In this case I'm talking about Macrium Reflect images, normally broken into multiple files not exceeding 4.25GB (an old habit from the DVD-recordable days!).
Your suspicions are correct: they tend to be 100GB+.
Something I am interested in is the Macrium Reflect files - I have a huge number of backups of older files. It would be quite nice to delete these once I know my current version is safe. I guess what I would need to do is compare the copies, but again I'm guessing that I would need to know the checksum beforehand… which I don't. I'm certainly going to teach myself about checksums, though, as they sound like a huge necessity - and they should go nicely hand in hand with ZFS.
ZFS has its magic checks built in precisely to guard against such situations, even bit rot…
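Concretely, a scrub makes ZFS walk every block and verify it against its stored checksum, repairing from redundancy where it can (the pool name here is hypothetical):

```sh
zpool scrub tank
zpool status tank   # shows scrub progress and any checksum errors found
```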
But I would personally still keep my own additional hash files independently. Yes, in the case of bit corruption they will not save your data, but they will let you know that the data is damaged or altered from the original.
When you compute the hashes matters: if the data is from a year ago, checksums generated today only attest to the state of the data as it is today, not as it was a year ago when you created it.
So you can create checksum files for yourself today, provided you know that the data you have today is still OK. A bit of a Schrödinger's cat situation.
In the old days of the warez scene, releases were required to include SFV files containing checksums of all the files belonging to the famous 14/15MB multi-part RAR archives, because without checksums you would never know whether the data had downloaded correctly short of a deeper inspection.
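The same idea still works today. A companion sketch to the manifest example above, which re-hashes each listed file and flags anything that no longer matches (paths hypothetical, matching the earlier example):

```powershell
# Re-hash each file listed in the manifest and warn on mismatches.
$root = "D:\archive"
Get-Content -Path (Join-Path $root "checksums.sha256") | ForEach-Object {
    $stored, $rel = $_ -split ' \*', 2
    $actual = (Get-FileHash -Path (Join-Path $root $rel) -Algorithm SHA256).Hash
    if ($actual -ne $stored) { Write-Warning "MISMATCH: $rel" }
}
```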
Yes, ZFS is pretty good - part of the reason I switched to it.
Thank you for this, Tim. Is there a good app you'd recommend for easily comparing multiple files? So far I only know how to do individual files using Windows PowerShell, but I'm sure there's a better way!
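For what it's worth, PowerShell can already do this in bulk. A minimal sketch, with hypothetical folder names, that hashes two copies of a backup set and reports any files whose contents differ - note that you don't need a checksum from the past just to confirm two copies match each other; you only need the old checksum to know that either copy still matches the original:

```powershell
# Hash every file in a tree, keyed by its path relative to the root.
function Get-TreeHashes($root) {
    Get-ChildItem -Path $root -Recurse -File | ForEach-Object {
        [PSCustomObject]@{
            RelPath = $_.FullName.Substring($root.Length + 1)
            Hash    = (Get-FileHash -Path $_.FullName -Algorithm SHA256).Hash
        }
    }
}
$old = Get-TreeHashes "E:\MacriumBackups"
$new = Get-TreeHashes "F:\MacriumBackups"
# SideIndicator shows which copy each differing or missing file came from.
Compare-Object -ReferenceObject $old -DifferenceObject $new -Property RelPath, Hash
```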