Potential ZFS metadata issue. Please help me fix family photos and videos! 🙏

I found some strange issues in one folder (subfolders are fine) of a ZFS dataset. I haven’t found other issues like this, but I haven’t been looking. I’m not sure if metadata got jacked or what, but notice these files?

image

The extensions are wrong. The JPGs are not 400MB, they’re MP4 files:
image

After renaming them, all the files work correctly, but then the EXIF data doesn’t match the filenames.

Correct EXIF filename

image

Incorrect EXIF filename

Filename          EXIF Date Taken
20220710_132126   7/10/2022 - 1:21:14 PM
20220710_132129   7/10/2022 - 1:21:21 PM
20220710_150750   7/10/2022 - 1:21:25 PM
20220710_150755   7/10/2022 - 1:21:29 PM
20220710_150804   7/10/2022 - 3:07:50 PM

Looking at these four files and the incorrect extensions in the screenshot above, it’s clear to me that all filenames are off-by-2.
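
To sanity-check the off-by-2 hunch, here's a rough Python sketch using the rows from the table above. It slides the filename column against the EXIF column and counts how many rows match at each shift. The 2-second tolerance and the function names are my own assumptions, not anything the phone or ZFS defines.

```python
from datetime import datetime, timedelta

# (filename stem, EXIF "Date Taken") pairs copied from the table above.
ROWS = [
    ("20220710_132126", "7/10/2022 - 1:21:14 PM"),
    ("20220710_132129", "7/10/2022 - 1:21:21 PM"),
    ("20220710_150750", "7/10/2022 - 1:21:25 PM"),
    ("20220710_150755", "7/10/2022 - 1:21:29 PM"),
    ("20220710_150804", "7/10/2022 - 3:07:50 PM"),
]


def hits_at_shift(rows, shift, tolerance=timedelta(seconds=2)):
    """Count rows whose filename time matches the EXIF time `shift` rows later."""
    names = [datetime.strptime(n, "%Y%m%d_%H%M%S") for n, _ in rows]
    exifs = [datetime.strptime(e, "%m/%d/%Y - %I:%M:%S %p") for _, e in rows]
    hits = 0
    for i, name_time in enumerate(names):
        j = i + shift
        if 0 <= j < len(exifs) and abs(name_time - exifs[j]) <= tolerance:
            hits += 1
    return hits


def best_shift(rows, max_shift=6):
    """Return the shift in [-max_shift, max_shift] with the most matches."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: hits_at_shift(rows, s))


if __name__ == "__main__":
    for s in range(-3, 4):
        print(s, hits_at_shift(ROWS, s))
```

On these rows, a shift of 2 is the only one that produces any matches. The first filename is one second off its match, which is why the sketch allows a small tolerance.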

I found a folder called “possibly dupes/”, and it contains a file called 20221021_111031.mp4. Today, that file is named 20221021_111334.jpg. That’s off-by-2. So it’s true that everything’s been pushed out.

What happened? Not sure. Everything is off-by-2 in here after July 10th, 2022.

But that’s wrong because I looked back further, and some others are messed up there too. It’s inconsistent. More below.

I looked at my oldest snapshot for this data, and it’s all messed up there too, but they’re off-by-6!

It’s weird because the photo counts don’t line up either. There are 2 extra photos in the oldest snapshot and 4 extra in the latest. 2 + 4 = 6. Maybe it’s related? But if you base it off filenames, there’s an extra filename in the current data versus the snapshot.

If it’s related to the off-by-2 and off-by-6 naming, then that’d be an easy thing to verify. If I go back to older photos, I’d see those missing ones, but they’re not there! Those older photos from July 9th are all named correctly.

I haven’t renamed these files at all. They all came straight off my phone or my wife’s phone onto my Windows box, and were then eventually copied onto the NAS using Robocopy or into a separate dataset using rsync. Not sure why they’re all messed up like this, but I really wanna fix them.

Other incorrect filenames I found

Notice here, these three MP4s are actually JPGs:
image

EXIF data shows they were taken the day before at 11am. That means something is definitely messed up here.

I looked at 36 files around these, and none of them are wrong. So how did these 3 images get improperly named as videos with the wrong date? That makes me think those videos are gone.

In the oldest snapshot, these images are also incorrectly named. In this case, it’s consistent between the two.

Looking further out, I found 3 JPGs that are actually MP4. They’re also incorrectly named, but they’re at least 26 files apart.
image

As a reference, the other images are correct, but this one is wrong: 20220614_133529. It’s also from June 9th like the other 3 incorrectly named MP4s.

Looking at ones a month later, sure, they’re incorrectly named here too, but the naming isn’t off-by-2:
image

And cross-checking these against my oldest snapshot, the names actually line up this time. I’m not sure what the pattern is for these being out-of-sync and having the wrong filenames.

20kB images?

I dunno why some images are <50kB. I looked through all of them and while the filenames are wrong, I found a number of them under 50kB:

These all look like photos I’d sent over Signal. The aspect ratio is 4:3 (same as I took them), but the photos are all sized down and the EXIF data is missing. This is really distressing to me because photos I send over Signal are ones that I value more than others, and it’s not like I have another copy of these. 🙁

Stress

At this point, I’m freaking out because it’s possible I’ve lost quite a few photos. I already know I lost some because there are <50kB photos that should be 3-8MB in size. That’s really bothering me.

The earliest snapshot is from Jan 8th. The next snapshot is Jan 15th; and other than new images, those files match the current filenaming.

I don’t have a backup of this directory anymore in Windows. I have 2 other datasets where I replicate this data and both have snapshots, but both snapshots are from February, long after the initial copy.

How to fix?

First, I put a hold on these snapshots as they’d normally be deleted in a few weeks.

I also need to figure out where image names aren’t lining up. There are a few ways to figure this out:

  1. Compare the current data to the oldest snapshot and see where filenames don’t match filesizes in this directory.
  2. Figure out which JPG have EXIF and which filenames don’t match the “Date Taken” field in EXIF data.
  3. Figure out which JPG files are actually MP4 and which MP4 are actually JPG by looking at filetype metadata.
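
For option 3, a minimal sketch of checking the real file type from the first few bytes rather than trusting the extension. It relies on two well-known signatures: JPEG files start with the bytes FF D8 FF, and MP4-family files carry an "ftyp" box at byte offset 4. The names `sniff_kind` and `find_mislabeled` are made up for illustration, and note that "ftyp" also matches related containers like MOV or HEIC, so treat "mp4" here as "some ISO media file".

```python
from pathlib import Path


def sniff_kind(path):
    """Return "jpg", "mp4", or None based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    if head.startswith(b"\xff\xd8\xff"):
        return "jpg"  # JPEG start-of-image marker
    if head[4:8] == b"ftyp":
        return "mp4"  # ISO base media "ftyp" box (MP4, but also MOV/HEIC/3GP)
    return None


def find_mislabeled(folder):
    """Yield (path, detected_kind) where extension and content disagree."""
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        ext = path.suffix.lower().lstrip(".")
        if ext == "jpeg":
            ext = "jpg"
        kind = sniff_kind(path)
        # Only report files we can identify and whose extension we know about.
        if kind is not None and ext in {"jpg", "mp4"} and kind != ext:
            yield path, kind
```

Looping over `find_mislabeled(...)` for the affected folder and printing each result would give the list of JPGs that are really videos and vice versa, without modifying anything.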

I’d also like to know why the oldest snapshot has different filenames. That makes no sense! It seems really odd to have one set of data with incorrect names and a later update of that same data with completely different names!

This makes me think ZFS’s metadata is corrupt for this directory. I do have a special device, so maybe that’s what broke it? Only for this folder in all my datasets?

The screenshots look like Windows, is this running ZFS on Windows, or is that over a network filesystem protocol?

What makes you think they were created correctly on Windows in the first place?

Do you have ZFS snapshots from different times to compare? (zfs diff snapshot1 snapshot2 | grep filename).

If they are mounted from another machine, does that machine also show the same problem on its local filesystem?

Hi,
Did you manage to back up the data with all incremental snapshots, then find a previous snapshot which both had the files and also had them uncorrupted?

I presume that, kill or cure, whatever happened is over now…

It probably does not help to suggest taking many snapshots in case the data is messed with? I use automated snapshot-making tools, so I don’t have to set it up manually, but your system is your own.

Sorry about my late reply. My kids got me sick this weekend, and I’m still recovering.

This is ZFS running on TrueNAS SCALE in Linux on a dedicated system. I was looking at the files in Windows over SMB.

I have multiple snapshots, and they’re automated, but at the point of copying them over, I’m not sure if they were already messed up or not. Maybe pulling them off my phone broke 'em. Not sure. This is the first time I’ve noticed this issue, and the fact that I have a “possibly dupes” folder with the correct name and the fact that they’re messed up differently in a later snapshot makes me think maybe the ZFS special device did something.

zfs diff is taking forever to run. I mean, it’s running, but I’m just watching it slowly output just about every filename.
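
Since only one folder seems affected, a narrower check might be quicker than a full zfs diff: compare that folder in the live dataset against the same folder under the dataset's hidden .zfs/snapshot/ directory. Here's a rough sketch that compares filenames and sizes only. The function names are hypothetical, and the paths in the trailing comment are placeholders, not my real dataset or snapshot names.

```python
from pathlib import Path


def sizes_by_name(folder):
    """Map filename -> size in bytes for regular files directly inside folder."""
    return {p.name: p.stat().st_size
            for p in Path(folder).iterdir() if p.is_file()}


def compare_folders(live, snap):
    """Return (only_in_live, only_in_snap, size_changed) as sorted name lists."""
    live_sizes = sizes_by_name(live)
    snap_sizes = sizes_by_name(snap)
    only_live = sorted(set(live_sizes) - set(snap_sizes))
    only_snap = sorted(set(snap_sizes) - set(live_sizes))
    changed = sorted(name for name in live_sizes.keys() & snap_sizes.keys()
                     if live_sizes[name] != snap_sizes[name])
    return only_live, only_snap, changed


# Hypothetical usage (placeholder paths):
# compare_folders("/mnt/pool/photos/folder",
#                 "/mnt/pool/photos/.zfs/snapshot/SNAPNAME/folder")
```

A name that exists in both places with a different size is a strong hint that the same filename now points at different content.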

I’m not sure why, but a bunch of files from this folder show up in the output in no particular order, while the files in other folders look fine.

It’s possible the Creation Date changed or something because I’d moved these to another drive in Windows at some point, then Robocopy’d them over again later after I finished adding more drives to the system. But I don’t remember there being any changes other than a few added photos.

Because of the COW nature of ZFS, it stores changes to existing files as a later transaction. So if the files had a different name/extension when first saved, the change would be stored as a later transaction.
One can’t always pick apart the transactions, because doing so would break files subsequently written to and changed on the pool.

Having said that, I don’t use the special vdev, and I presume that if ZFS faulted, the data would be dead/irretrievable, not “close but wrong”.

The extension change is not a ZFS thing I’ve ever heard of, and I would be looking at the external app that might have changed or altered the files/file extensions.

I have had a corrupt file before, and the system just refuses to load it; it does not do any extension editing.

For this kind of thing, where the images shouldn’t be changing, it makes sense to just keep snapshots forever. You can set up a separate snapshot job to take a snapshot quarterly that never gets deleted. Then it’s easy to go back years in time.

ZFS should not change the filename extension or clobber an image from its original size to 1kB; that’s very unlikely. And without an older snapshot, recovery at this point is likely impossible.

With a dataset where the data really, really never changes, there is no harm in keeping snapshots forever, though. Just don’t keep an excessive number of them.

There was one kind of virus I encountered 20 years ago. It changed all my files to “exe” files while keeping the original icons, e.g. “kids.jpg” would be renamed to “kids.jpg.exe”. It expected an innocent user to copy those infected files to another computer and double-click them.

Just out of interest, have you tried looking at them directly from the CLI? That is, going to the shell, running cd /mnt/the-folder-in-question, and then ls?

I do feel for you, can’t be at all nice if you don’t have a backup.

I don’t think this is a ZFS issue.

After some time, I noticed that even when freshly copied off my phone, the image and video files had the wrong extensions.

I have a Samsung Galaxy Note9.

I can’t remember exactly, but I think they were even stored wrong on the phone itself. I wish I knew what caused this.

I wanna be clear though, the data and filenames were different between two snapshots. It’s unclear how that’s even possible, but the #1 thing I have to fix is those filenames.


Still, I’d like to solve it.

I need to write some sort of script to check if it’s a video file or image, check the EXIF date, and reset that as the filename.

Because some filenames are named what the next files are supposed to be named, I actually have to do this in a certain order. No multi-threading or async.
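
One way around the ordering problem might be a two-phase rename: first move every affected file to a unique temporary name, then move each one to its final name, so no rename can land on a file that hasn't been moved yet. Here's a minimal sketch, assuming the old-name to new-name plan has already been worked out (for example from the EXIF dates). `apply_renames` and the `.rename-*.tmp` naming are hypothetical, and the sketch doesn't read EXIF itself.

```python
import uuid
from pathlib import Path


def apply_renames(folder, plan):
    """Rename files inside folder; plan maps current name -> desired name."""
    folder = Path(folder)
    targets = list(plan.values())
    if len(set(targets)) != len(targets):
        raise ValueError("two files want the same target name")

    # Refuse to overwrite files that are not part of the plan.
    untouched = {p.name for p in folder.iterdir()} - set(plan)
    clash = untouched & set(targets)
    if clash:
        raise ValueError(f"target names already in use: {sorted(clash)}")

    # Phase 1: park every source under a unique temporary name.
    staged = {}
    for old, new in plan.items():
        tmp = folder / f".rename-{uuid.uuid4().hex}.tmp"
        (folder / old).rename(tmp)
        staged[tmp] = folder / new

    # Phase 2: every final name is now free, so order no longer matters.
    for tmp, final in staged.items():
        tmp.rename(final)
```

I'd only try something like this on a copy or a clone of the dataset first, with a fresh snapshot taken beforehand, since a failure halfway through would leave files sitting under their temporary names.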

This’ll be one scary project. Glad I have snapshots at all!