Having some issues understanding some ZFS terminology (vdev vs pool vs filesystem vs volume vs dataset)

Hello there,

I’m currently a little bored, so I am looking into ZFS a bit. I was thinking about building a new(-ish) NAS when I get the time, so this is more or less preparation, though I probably won’t use it for a while. Anyway, on to the questions (sorry, it’s going to be one of those posts).

So the first thing I didn’t quite get is what the point of a vdev is and how it differs from a pool.
What I understand from wendell’s videos is that a vdev is a set of disks that handles its own parity, and a pool consists of one or more vdevs. So far so good, but I don’t quite understand the point.

If a vdev is handling its own parity, why would I throw multiple vdevs into a pool and not just use multiple pools?
From what I gather a pool can write data across all of its vdevs, which would make sense just to increase the total storage capacity when adding drives later.
However, at that point I wonder how writing and file integrity work across multiple vdevs. Since each vdev could potentially handle its own parity differently (e.g. vdev1 is a RAID1, while vdev2 is a RAIDZ2, and vdev3 is a RAID0 for some reason), how does ZFS decide where to store the data?

For illustration: I have a block of data I write to the storage pool, and the data happens to land on vdev3. Now one of the drives in vdev3 dies for inexplicable reasons without warning; what now? Since RAID0 is not exactly recoverable, I’m wondering how this is handled. I mean, sure, the combination doesn’t make a lot of sense, but it is technically possible, right?
The same goes of course for 2 drive failures in vdev1 or 3 drive failures in vdev2, because I had a bad batch of drives or whatever.
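From what I’ve read so far, building such a mixed pool would look something like the following (device names are just placeholders, and apparently zpool is supposed to complain about the mismatched redundancy unless you force it):

```
# Hypothetical mixed pool: a mirror, a raidz2, and two bare disks
# (there is no raid0 vdev type; bare disks become single-disk vdevs
# that the pool stripes across). zpool should refuse the mismatched
# replication levels unless -f is given.
zpool create tank \
    mirror ada0 ada1 \
    raidz2 ada2 ada3 ada4 ada5 \
    ada6 ada7
```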

On that end: Yes I know you’re supposed to have a separate set of backups in a different location, but that’s another story.

So, going on I tend to stumble onto more terminology that I’m not quite sure what it means. In his videos wendell mentioned datasets every now and then, but I’m not quite sure what exactly a dataset is in ZFS terms.
From what I understand from the FreeBSD page, “dataset” is just a generic term with no inherent meaning, as it doesn’t necessarily say what kind of dataset it is?
From the word alone I would have thought that each dataset would have a different purpose (for example in a NAS, one dataset would be the media storage, and one would be personal files), but that doesn’t seem to be the case.

Speaking of media storage and personal files, would this be a reasonable use of multiple pools, or would one just use a regular directory structure for that? Or is that where “filesystems” (a weird name IMO, since ZFS is also a filesystem?) come in? The aforementioned FreeBSD page says that filesystems are mounted somewhere and act like any other filesystem (as if each were a separate disk?). So creating a filesystem and mounting it as /home (or whatever else) would fit that description.
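If I understand that page right, it would be something like the following (I’m only guessing at the names and syntax here):

```
# Create a filesystem dataset and mount it at /home
zfs create -o mountpoint=/home tank/home
zfs list tank/home
```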

And the last thing I stumbled over was volumes. The page above says they are block devices, useful for creating other filesystems on top of ZFS or as iSCSI extents. So from what I can tell, they don’t have much use in a home NAS?

So, if you read this far, thanks for the attention. I’m sorry for the long post, but ZFS seems to be a riddle to me…


Boy, I wish I had time to go through this, but basically: imagine a RAID card, which can clump drives into RAIDs and slap a volume on top to present to the OS.
A pool is just a bunch of software RAIDs with a volume on top.
If any of the vdevs are RAID0 (striped), then when one disk bites it, the entire pool is hosed, regardless of the other vdevs/parity.
That’s the consensus on why not to mix types of vdevs in a pool.
The system will let you, and will even let you mix solid state with spinning media.
The allocation of which vdev to write to changed at some point; iirc it is now supposed to write new data to whichever vdev writes quickest if there is a difference, rather than spread it evenly across the whole pool. The reasoning is that when you add a vdev to a pool to expand it, the new vdev (being empty) should fill quicker, to get to the same level of fullness as the existing ones.
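Rough sketch of what that looks like, off the top of my head (disk names made up):

```
# A pool built from two raidz2 vdevs; writes get striped across both
zpool create tank \
    raidz2 da0 da1 da2 da3 da4 da5 \
    raidz2 da6 da7 da8 da9 da10 da11

# Later, expand it by adding a third raidz2 vdev; new writes should
# favour the emptier vdev until the levels of fullness even out
zpool add tank raidz2 da12 da13 da14 da15 da16 da17

zpool status tank   # shows the pool laid out as a list of vdevs
```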
It’s more complex than that, and actual technical people will step in with real info shortly, but I would recommend the ZFS book by MW Lucas and A Jude at Tilted Windmill; it has loads of info and is funny with it.


And datasets are like thin-provisioned partitions that can use up the whole of the pool but start off using next to no space. They can also be individually copied, replicated, snapshotted, and removed.
You could even make a dataset for every folder, each with settings optimised for its contents (larger recordsize for big media files, smaller recordsize for a database, etc.), and you can set a dataset’s mount point to practically anything.
You can nest datasets too, so have one called media that is unmountable, and a child dataset for each of music, tv and movies, with different settings for each.
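Something along these lines, just as a sketch (pool and dataset names made up):

```
# Parent dataset that exists only for organisation, never mounted itself
zfs create -o canmount=off -o mountpoint=/srv/media tank/media

# Children inherit from the parent but can override settings individually
zfs create -o recordsize=1M  tank/media/movies   # large records for big files
zfs create -o recordsize=1M  tank/media/music
zfs create -o recordsize=16K tank/media/appdb    # smaller records for a database

# Snapshot just one of them, independently of the rest of the pool
zfs snapshot tank/media/movies@before-cleanup
```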


By volumes, I presume you mean zvols? I don’t have much experience with them, but yeah, you can create one and use it as an iSCSI target, with the benefit of snapshotting (for rolling back) and stuff.
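Roughly like this, I think (size and names just for illustration):

```
# Create a 100 GiB zvol; it shows up as a block device under /dev/zvol/
zfs create -V 100G tank/vmdisk0

# It can be snapshotted and rolled back like any other dataset
zfs snapshot tank/vmdisk0@clean-install
zfs rollback tank/vmdisk0@clean-install
```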

First off, thanks for the answers so far.

So if I understand this right, a single file could potentially end up split across multiple vdevs?

But doesn’t that sort of contradict the information in the FreeBSD documentation (however accurate that might be)? From what I understand there, the term “dataset” could be a number of things in ZFS. By their definition, what you describe would be a filesystem (which is a type of dataset, as they describe it).

Heard of that one a lot of times, but I’m not at the point where I would dive nose-deep into ZFS yet.

I guess so yes, at least from what I gathered.


Yes, files are always split up into records of the dataset’s recordsize, written to the vdevs in blocks according to the ashift, and spread among the disks (providers) according to the vdev’s parity level.
The system knows where each part of the file is, but I don’t know any way to tell where each bit is.
Iirc, it won’t put the parity and data on the same provider, unless there is only one provider.
You can intentionally increase the number of copies saved, and the number of copies of metadata.
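As a rough sketch of where those knobs live (dataset name made up):

```
# Keep two copies of everything in this dataset, on top of the vdev redundancy
zfs set copies=2 tank/important

# Check what a dataset is actually set to
zfs get recordsize,copies,compression tank/important

# ashift is fixed per vdev at creation time (e.g. 12 means 4K sectors)
zpool get ashift tank
```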
For datasets, the handbook would be correct, but it confuses me, as I think of them as file systems. A pool only has one root file system, with child file systems underneath it, which can be copied over to another pool, so I find it easier to think of datasets as “partitions” on a pool’s “disk”. But now it makes less sense to me, so um, dang, I guess I’m just gonna stop…


You can, and some do; you just end up with 2 smaller pools.

Realistically you would avoid mixing levels of redundancy, because your pool is only as strong as its weakest link. Each piece of a file is split across the vdevs evenly afaik, and then checksumming is used on top. There’s never a reason to use RAID0 for a vdev, since striping is effectively what the pool already does… which is why if you lose a vdev, you lose the pool.

That’s not exactly how it works, but if you lose the vdev then the entire pool is gone. You’ll never have all of the data on one vdev unless you only have one vdev.

You could create single-drive vdevs and pool them, but you still wouldn’t be without checksumming. You just wouldn’t have fault tolerance, so a single disk loss would still take out the pool. There’s no RAID0 option for a vdev afaik; striping is a function of the pooling. Imagine you had a RAID50 or RAID60: that would be equivalent to raidz1 or raidz2 on 2 vdevs in a single pool.
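To put the two layouts side by side (sketch, disk names made up):

```
# "raid0-style": a pool of single-disk vdevs: still checksummed,
# but zero fault tolerance, so one dead disk kills the pool
zpool create scratch sdb sdc sdd

# "raid60-style": two raidz2 vdevs striped together in one pool
zpool create tank \
    raidz2 sde sdf sdg sdh sdi sdj \
    raidz2 sdk sdl sdm sdn sdo sdp
```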

Think of it as a way to separate data on the pool: you can have a single dataset and share that out with all your friends, or you can give each friend their own dataset so no one gets mixed up. There’s more to it as well, like compression, deduplication and some other black magic, but that’s probably not as important to the home user.
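For example (names made up):

```
# One dataset per user, each with its own quota and compression setting
zfs create -p tank/users/alice
zfs set quota=500G tank/users/alice
zfs set compression=lz4 tank/users/alice
```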

It’s got advantages and disadvantages compared to just a simple folder share. I think most people probably just run one dataset per pool, but what do I know?

The beauty of this whole thing is that you can make your setup as complicated or as simple as you’d like. When I started using FreeNAS I just did simple mirrors and a single SMB share, and that was good enough for me.

I don’t know much about this one, but I think it gets deeper into software-defined storage than I understand, so I won’t even begin to try to explain it to you.

I will say that ZFS isn’t for everyone. I moved on to Unraid without ZFS, but I kept my FreeNAS box and use it for backups on a raidz2 pool.
