tl;dr: What options are out there to transfer data from a ZFS volume to a Windows machine besides Samba?
Background: helping my partner consolidate her backups, which were spread out across multiple external USB drives and old computers with an unknown number of duplicates from ‘make sure I have it’ copies of copies.
Initial reconnaissance turned up a total of ~2.5 TB of data. The only storage I had with that much capacity was my NAS, so I started by consolidating everything onto an SMB share. During the process, I learned a lot about bottlenecks with USB, small files on HDD arrays, SMB overhead, rsync, zfs send/recv: great stuff, and well worth every minute spent. The end result was that most of the backups were Ctrl+A copy/pastes of entire systems, so we were able to eliminate a lot of that fairly quickly and end up with ~820 GB of mostly actual data (some of which may be redundant copies), with a ton of small files in the mix to slow things down.
I have a couple TB of NVME storage on my Windows 10 machine, so now the question becomes: what is the least painful way to get ~820GB of medium/small files from my truenas box to my windows box? I know of a few options:
Network transfer with a samba share
The heat death of the universe would probably happen faster than this transfer completing.
Mount a large USB drive in the truenas box and copy files locally
My hangup with this method is file system support between truenas core & windows. Feels like I’d shoot myself in the foot trying to get this one actually done.
FreeBSD VM on windows machine, zfs send/recv, then somehow copy local data out to a partition that windows can understand
The problem is the somehow. I have an NVMe drive with more than double the required space, so this seems possible, but I kind of run into the same wall that I do with the local USB drive.
Just copy the backups from the original sources again
Since I know I’m only working with <1TB of data, it’s possible to copy the data again from the original sources over to a local drive instead of involving my NAS at all. This is my rock and stick option and the only solution that I know that I could implement with my current knowledge.
The heart of my question: What options are out there to transfer data from a zfs volume to a windows machine besides samba? I’ve got plenty of spare PC + UPS available to leave things running for as long as needed.
As I write this, I realize that she spends most of her time on a Mac, so ultimately I’ll want this data in a format that is accessible from OSX. One problem at a time.
If you have enough free space, create a rar/zip archive without compression, transfer that one large file, then unpack it on the destination. Or create the archive directly on the other machine, using the NAS share as the source.
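A store-only archive along those lines might look like this; the paths and dataset names here are placeholders, not anything from the original setup:

```shell
# Hypothetical paths -- substitute your own dataset and staging location.
SRC=/mnt/tank/consolidated   # the data to move
OUT=/mnt/tank/staging        # needs enough free space for the archive

# -0 stores without compressing, so the archive step is disk-bound rather
# than CPU-bound, and the result is one large sequential file that SMB
# handles far better than thousands of small files.
zip -r -0 "$OUT/backup.zip" "$SRC"

# tar with no compression flag does the same job and copes better with
# unusual filenames and Unix permissions:
tar -cf "$OUT/backup.tar" -C "$SRC" .
```

Unpacking on the receiving side is then a single sequential read of one big file.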
If you don’t have that much space, you can test FTP or SFTP, but I don’t expect any major differences compared to SMB.
Another approach could be to use a backup program that creates one large file on the NAS on the fly. But as before, I don’t think the difference will be big, because we’re still going through SMB/FTP.
Another, more roundabout method is to create a VeraCrypt container of that size on the NAS, share it via SMB, and mount it on the Windows machine, but I don’t expect any improvement… because in the end we’re always dealing with SMB.
From a little experience, all approaches are somewhat cumbersome. The latest TrueNAS ships a newer Samba version, which may be faster, so it’s worth a try. Something like Syncthing, as mentioned, will help confirm you have everything if you need to do it in chunks.
I assume you are creating another copy just in case rather than something along the lines of a local share.
I’d be tempted to try a network (Samba) approach and a USB disk in parallel and see which wins!
When you are working with TrueNAS anyway, you may as well use ZFS send/receive to replicate the data.
zfs send pool/dataset@snapshot > backup.bak
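In a bit more detail, with hypothetical pool, dataset, and host names (zfs send operates on snapshots, so one has to exist first):

```shell
# Take a snapshot of the dataset to be sent (names are placeholders):
zfs snapshot tank/backups@migrate

# Dump the replication stream into one big file. Note this is a raw ZFS
# stream, not a file-level archive -- only zfs recv can unpack it.
zfs send tank/backups@migrate > /mnt/staging/backup.bak

# Or pipe the stream straight into another ZFS system over SSH:
zfs send tank/backups@migrate | ssh user@freebsd-vm zfs recv -F tank2/backups

# Restoring the file-based stream later, on any ZFS system:
zfs recv tank2/backups < /mnt/staging/backup.bak
```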
Yeah that’s why send/receive is so great. Max speed all the time, because only one single large file and zfs send doesn’t care or know about files.
zfs send sends the data via standard output, so you can redirect and pipe and everything.
You can receive the backup snapshot on any zfs system. I use it to store ZFS stuff on non-ZFS systems. I prefer it to “old-school” zipping everything to store and archive (particularly because my datasets are already compressed).
If both sides have ZFS, just use send and receive. Best, easiest and fastest backup you will ever see.
So with the backup.bak file, what will it look like when I copy that over to my windows system? Would that show up as an archive that 7zip or the like would be able to work with?
Syncthing seems like it would be a great solution if I were going to make a habit of fighting with large amounts of data like this. ChoEazyCopy (robocopy) and zfs send seem platform-specific enough that going between BSD and Windows isn’t possible.
My takeaway is that network transfer of individual small files is, basically, always slow and that setting yourself up to use block transfer instead is a must.
Luckily I do have plenty of excess storage capacity in this case, so tar and un-tar on both sides of the transfer looks like my best option. Were these both Linux systems, I could do that seamlessly with one command, which is really cool. It seems like that might even be possible in my case if I employ WSL2.
One of the mistakes I see in production for the millionth time: a backup that has not been tested for recovery is not a backup that should be taken seriously.
The second thing is that not every backup requires a complete restore in every case, sometimes there is a need to extract only specific files.
You use robocopy and leave it overnight; you can also use rsync (the compression option will kill performance). I have no idea, however, whether you’ll see any performance difference between the two.
On macOS you use SMB or NFS; you can also use utilities such as rsync for transferring.
Here’s what I’ve come up with; this should tar the source files, stream them over the network, and untar them on the other end without having to store a second copy:
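A pipeline along those lines, assuming SSH access to the NAS from WSL2 (the host, user, and paths are placeholders):

```shell
# tar on the NAS streams the dataset to stdout; the local tar unpacks the
# stream as it arrives, so no intermediate archive is ever written on
# either side of the transfer.
ssh admin@truenas "tar -cf - -C /mnt/tank/consolidated ." \
  | tar -xf - -C /mnt/d/backups
```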
That’s a good point, I’ll have to pay attention to the speed and see how it goes.
EDIT: seems like it’s bounded by how fast tar can chew through the smaller files. It does at times saturate my gigabit network, but it’s not immediately clear whether this is actually much faster.
What kind of problems do I need to watch out for as far as the fragility you mentioned?
Testing the above tar-and-send option on the actual data resulted in ~2 MB/s transfer speed. Luckily, tar-ing the data and just copying it through Windows Explorer saturates my network bandwidth.
If we’re talking about moving large files from a ZFS volume to a Windows machine, a third-party tool like GoodSync or Gs Richcopy 360 can do the job; both support SMB, FTP, and SFTP, and both copy large files very fast.