Need ideas to make a large file transfer less painful

tl;dr: What options are out there to transfer data from a ZFS volume to a Windows machine besides Samba?


Background: helping my partner consolidate her backups, which were spread out across multiple external USB drives and old computers with an unknown number of duplicates from ‘make sure I have it’ copies of copies.

Initial reconnaissance turned up a total of ~2.5 TB of data. The only storage I had with that much capacity was my NAS, so I started by consolidating everything in an SMB share. During the process, I learned a lot about bottlenecks with USB, small files on HDD arrays, SMB overhead, rsync, and zfs send/recv; great stuff and well worth every minute spent. The end result was that most of the backups were Ctrl+A copy/pastes of entire systems, so we were able to eliminate a lot of that fairly quickly and end up with ~820 GB of mostly actual data (some of which may still be redundant copies), with a ton of small files in the mix to slow things down.

I have a couple TB of NVMe storage on my Windows 10 machine, so now the question becomes: what is the least painful way to get ~820 GB of medium/small files from my TrueNAS box to my Windows box? I know of a few options:

  • Network transfer with a Samba share
    The heat death of the universe would probably arrive before this transfer completes.

  • Mount a large USB drive in the TrueNAS box and copy files locally
    My hangup with this method is file system support between TrueNAS Core and Windows. Feels like I’d shoot myself in the foot trying to get this one actually done.

  • FreeBSD VM on the Windows machine, zfs send/recv, then somehow copy the local data out to a partition that Windows can understand
    The problem is the somehow. I have an NVMe drive with more than double the required space, so this seems possible, but I kind of run into the same wall that I do with the local USB drive.

  • Just copy the backups from the original sources again
    Since I know I’m only working with <1 TB of data, it’s possible to copy the data again from the original sources over to a local drive instead of involving my NAS at all. This is my rock and stick option and the only solution I know I could implement with my current knowledge.

The heart of my question: What options are out there to transfer data from a ZFS volume to a Windows machine besides Samba? I’ve got plenty of spare PC + UPS available to leave things running for as long as needed.

As I write this, I realize that she spends most of her time on a Mac, so ultimately I’ll want this data in a format that is accessible from OSX. One problem at a time :dizzy_face:

Syncthing may be a solution to explore.

Depending on how frequently you want to do this and how automatic it needs to be, WinSCP from the Windows machine could be used.

Working at the block level is definitely better for dealing with lots of files, but of course it’s more complex given the mixed OSes.


LTT mentioned ChoEazyCopy.


If you have enough free space, create a rar/zip archive without compression on the NAS, transfer that one large file, and then unpack it on the Windows machine. Or create the archive in one go on the Windows side, reading from the NAS share as the source.
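
For example, something along these lines on the NAS shell would build a store-only archive (the paths are made up, this assumes zip is installed on the NAS, and -0 means no compression at all):

# Build a single uncompressed zip of the consolidated data
# (/mnt/pool/consolidated and /mnt/pool/staging are placeholder paths)
zip -r -0 /mnt/pool/staging/backups.zip /mnt/pool/consolidated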

If you don’t have that much space, you can test FTP or SFTP, but I don’t expect any major differences compared to SMB.

Another approach could be to use a backup program and create one large file on the NAS on the fly. But as before, I don’t think the differences will be big, because we are still going through SMB/FTP.

Another, more abstract method is to create a VeraCrypt container of that size on the NAS, share it via SMB, and mount it on the Windows machine, but I don’t expect any improvement… because in the end we are still dealing with SMB.

You can also try TeraCopy…


From a little experience, all approaches are somewhat cumbersome. The latest TrueNAS now ships a newer Samba version which may be faster, so it’s worth a try. Something like Syncthing, as mentioned, will help confirm you have everything if you need to do it in chunks.

I assume you are creating another copy just in case rather than something along the lines of a local share.

I’d be tempted to try a network (Samba) approach and a USB disk in parallel and see which wins!


When you are working with TrueNAS anyway, you may as well use ZFS send/receive to replicate the data.

zfs snapshot pool/dataset@backup
zfs send pool/dataset@backup > backup.bak

Yeah, that’s why send/receive is so great. Max speed all the time, because it’s one single large stream and zfs send doesn’t care or know about individual files.

zfs send sends the data via standard output, so you can redirect and pipe and everything.

You can receive the backup snapshot on any zfs system. I use it to store ZFS stuff on non-ZFS systems. I prefer it to “old-school” zipping everything to store and archive (particularly because my datasets are already compressed).

If both sides have ZFS, just use send and receive. Best, easiest and fastest backup you will ever see.
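
As a rough sketch, assuming a snapshot called pool/dataset@backup already exists on the sending side and a pool called tank on the receiving box:

# Stream the snapshot over SSH straight into a dataset on the other ZFS system
zfs send pool/dataset@backup | ssh user@otherbox zfs receive tank/dataset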


So with the backup.bak file, what will it look like when I copy that over to my windows system? Would that show up as an archive that 7zip or the like would be able to work with?

No, it’s just a file with gibberish. Can only be read by a receive on a zfs system.

And you usually don’t open a backup unless you want to restore it, so I never felt the need to open them.
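
If you ever do need the files back, the restore is just the reverse, something like this (pool/restored is a placeholder dataset name, and it has to run on a ZFS system):

# Turn the saved stream back into a browsable dataset
zfs receive pool/restored < backup.bak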


Thank you all for the software recommendations.

Syncthing seems like it would be a great solution if I were going to make a habit of fighting with large amounts of data like this. ChoEazyCopy (robocopy) and zfs send seem to be platform-specific enough that going between BSD and Windows isn’t possible.

My takeaway is that network transfer of individual small files is basically always slow, and that setting yourself up to use block transfer instead is a must.

Luckily I do have plenty of excess storage capacity in this case, so tar and un-tar on both sides of the transfer look like my best option. Were these both Linux systems, I could do that seamlessly with one command, which is really cool. It seems like that might even be possible in my case if I employ WSL2.


This is one of the mistakes I see in production for the millionth time: a backup that has not been tested for recovery is not a backup that should be taken seriously.

The second thing is that not every backup needs a complete restore; sometimes you only need to extract specific files. :slight_smile:


You can use robocopy and leave it running overnight, or you can use rsync (the compression option will kill performance). I have no idea, however, whether you’ll see any difference in performance between the two.
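
A rough robocopy invocation from the Windows side might look like this (the share name and drive letter are placeholders):

rem /E copies subdirectories including empty ones, /MT:16 uses 16 copy threads,
rem /R:1 /W:1 keeps retries on failed files short
robocopy \\truenas\backups D:\backups /E /MT:16 /R:1 /W:1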

On macOS you can use SMB or NFS, and you can also use utilities such as rsync for transferring.
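
If rsync is the route, a pull from the NAS over SSH could look roughly like this (host and paths are placeholders; note there is no -z, since compression only burns CPU on a LAN):

# -a preserves permissions and timestamps, --partial lets interrupted files resume,
# --progress shows per-file progress
rsync -a --partial --progress user@truenas:/mnt/pool/consolidated/ /path/to/destination/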

Here’s what I’ve come up with; this should tar the source files, transfer them over the network, and untar them on the other end without having to store a second copy:

tar -cf - SOURCE | pv | ssh REMOTE_HOST 'tar xf - --warning=no-unknown-keyword -C "DESTINATION"'

The destination is WSL on my Windows machine. As a bonus, now I know that BSD tar and GNU tar aren’t identical, hence the no-unknown-keyword flag.

The downside is that it’s fragile if something hiccups, and ssh will likely make it slower than necessary.

That’s a good point, I’ll have to pay attention to the speed and see how it goes.

EDIT: it seems like it’s bound by how fast tar can work through the smaller files. It does at times saturate my gigabit network, but it’s not immediately clear whether this is actually hugely faster.

What kind of problems do I need to watch out for as far as the fragility you mentioned?

These kinds of transfers are very sensitive to network latency. I wouldn’t trust it on a busy network, any Wi-Fi, or, god forbid, a WAN connection.

And if you are not using a WAN connection… why use it at all if you have a TrueNAS server? Let the storage server be the one serving storage.

The constraint is that this data needs to end up back on an external drive and be returned to the owner in a format that they can use.

Testing the above tar-and-send option on the actual data resulted in ~2 MB/s transfer speeds. Luckily, tar-ing the data and just copying the archive through Windows Explorer saturates my network bandwidth.
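
For anyone who finds this later, the shape of the winning approach was roughly this (paths are examples; the .tar gets copied over the SMB share with Explorer and then unpacked on the Windows side, here from WSL):

# On the TrueNAS shell: pack everything into one uncompressed tar on the pool
tar -cf /mnt/pool/staging/backups.tar -C /mnt/pool/consolidated .
# On Windows, after copying backups.tar over with Explorer, extract it from WSL
mkdir -p /mnt/d/restored
tar -xf /mnt/d/backups.tar -C /mnt/d/restored --warning=no-unknown-keyword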


Best you can ever hope for :wink:


The rock and stick solution wins again lol. Just this time it was a slightly sharper rock.

If we are talking about moving large files from a ZFS volume to a Windows machine, a third-party tool like GoodSync or GS RichCopy 360 can do the job; both support SMB, FTP, and SFTP and copy large files very fast.