TrueNAS iSCSI Questions - How do I get 100% of the space?

Hey all, first post here because all the other forums are dead!

I have a TrueNAS Core system I use for SMB and NFS, but so far I’ve never really used it for iSCSI

In my ESXi box I have a 3.2TB NVMe Micron 9100 PRO SSD in HHHL format. It’s just a local datastore in ESXi, which is limiting for my goal of replacing my larger ESXi system with many smaller systems

Assuming it’s compatible, I want to move the SSD into TrueNAS, and share it out with iSCSI to my ESXi box

I got a spare 800GB Intel SSD and made a pool with it as the only device, giving me around 720GiB of available space. Then I followed the guides to get iSCSI going, which involved creating a zvol to back the iSCSI target. I winged it at first and made the zvol almost the entire size of the pool, which now gives me warnings


After looking it up a little, everyone suggests only setting it to 80% of the pool. Well, on a 3.2TB SSD that’s over 600GB! Is there a trick to getting 100% of the space without performance issues? Or do I just have to give up that 600GB? In ESXi there is seemingly no issue using almost all the space with VMFS

With any filesystem, performance will degrade as the pool fills up, and this is especially true for ZFS on TrueNAS.

You wouldn’t want to allocate more than 80% to the ZVOL under almost any circumstance.
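To put numbers on that, here is a minimal sketch of the arithmetic in plain POSIX shell. The function name zvol_budget is made up for illustration, and the sizes are just the ones mentioned in this thread (about 720GiB usable on the test pool, and the nominal 3200GB of the Micron). Real usable space after ZFS overhead will be somewhat different.

```shell
# zvol_budget POOL_SIZE PERCENT
# Prints how much to give the zvol and how much stays free, in the same
# unit as POOL_SIZE. Integer math, so results are rounded down.
zvol_budget() {
    pool=$1
    percent=$2
    zvol=$(( pool * percent / 100 ))
    echo "zvol=$zvol free=$(( pool - zvol ))"
}

zvol_budget 720 80     # prints: zvol=576 free=144
zvol_budget 3200 80    # prints: zvol=2560 free=640
```

The second line is where the "over 600GB" figure in the question comes from.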

For more efficiently using storage space, you’d have to choose VM Disks and Share Types that allow for thin provisioning (e.g. if you want to have more than just that ZVOL on your SSD Pool)


What @felixthecat said. And you also have to leave space for snapshots and stuff.

 zfs create -sV 2600G mypool/zvolname

would create a sparse (aka thin provisioning) zvol that only uses space it actually needs atm. But when talking about capacity concerns, a zvol isn’t your best friend.
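If you want to see what the sparse zvol actually consumes, something along these lines should do it. This is only a sketch: the helper name zvol_plan is hypothetical, mypool/zvolname and 2600G are taken from the command above, and the function just prints the commands so you can read them before running anything on a real pool.

```shell
# zvol_plan VOLUME SIZE
# Prints (does not run) the commands to create a sparse zvol and inspect it.
zvol_plan() {
    vol=$1
    size=$2
    pool=${vol%%/*}    # everything before the first slash is the pool name
    # -s skips the refreservation, so space is only consumed as data lands.
    echo "zfs create -s -V $size $vol"
    # volsize is what ESXi sees; used/referenced is what the pool really holds.
    echo "zfs get volsize,refreservation,used,referenced $vol"
    # Keep an eye on how full and how fragmented the pool is getting.
    echo "zpool list -o name,size,allocated,free,fragmentation,capacity $pool"
}

zvol_plan mypool/zvolname 2600G
```

On a sparse zvol refreservation should show as none, while a regular (thick) zvol reserves roughly its full size up front.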

The closer you get to capacity limit, the higher the fragmentation will be (and amplified by low record/block size data like zvols). You can mitigate a lot by proper caching (ARC), but every filesystem suffers from this, especially CoW filesystems like ZFS.

But making a pool with a single disk with only one zvol at 100% isn’t really what you do with ZFS. If you want ESXi to have exclusive access over this drive, then just plug it into the ESXi machine. Way less overhead, more performance, less headache, less management.

Did you test performance via your other pool? Capacity concerns should be less important there and proper caching should provide good performance too.


Thanks guys, good information

So, if I make a thin Zvol and give it 80% (Or maybe sneak it up to 90%…) I think I should be good?

This pool will have nothing other than the iSCSI target on it, and I will not be utilizing snapshots either. All my VM’s are also already thin

Will fragmentation affect purely storage speed? And will that fix itself in the background? The chances of me getting this SSD completely full is not very likely, however I like having the extra buffer room to prevent a VM going wild and filling the datastore and impacting other VM’s. Perhaps I should make 2 Zvols, one for “Home Production” VM’s that can’t go down, and the other for my less important VM’s

The single disk with one zvol I know isn’t ideal, but this SSD has its own built in redundancy and everything is backed up, so I’m not too concerned about the SSD failing. I would buy a second, but they are wildly expensive

So far I’ve tested the performance of three different pools with iSCSI

a Single Intel DC S3700, 100% zvol shared out - Performance was great. Incoming vMotions over iSCSI (With only 1500MTU) maxed out at ~1000MiB/s which is probably hitting the limit of the NIC in the ESXi box (I only gave iSCSI a single 10G Port). The SSD isn’t that fast, so I assume it was using RAM a lot too. Very impressed

2 x 8TB 7.2K RPM SAS Disks in a mirror - Performance was as expected, garbage. Incoming vMotions would start out kinda fast, and then go down to just 55MiB/s incoming on the NIC

Then my main data pool for important data. I built this just over a year ago hoping for good performance after coming away from a very slow Synology. It’s 6 x 4TB disks, a mix of SAS and SATA, in striped mirrors, with a 1TB Samsung 970 NVMe SSD for L2ARC and 2 x Intel DC S3700 SSD’s in a mirror for metadata. This pool has exceeded my expectations. So far every time I copy large or small files from or to my desktop PC over 10G, I pretty much hit 1GB/s in Windows Explorer, and listing directories etc is lightning fast. Night and day compared to the Synology.

When I put some VM’s on this pool, they actually performed pretty well, and the vMotion was quite fast, faster than I expected.

I agree leaving the SSD in the ESXi box might be the easiest way, but my goal here is to start to downsize that machine. It has 2 x E5-2680 V4’s, and the ONLY purpose of the second CPU is to get me the extra PCI-E lanes. If I take this SSD out, I can ditch a CPU and 128GB of unused memory and save on power and more importantly, heat. The ESXi box also has a virtual TrueNAS setup which stores my unimportant data with 12 x 8TB SAS disks in the front, and when I added those I had to start increasing fan speed to cool the CPU’s. Any more fan speed and it’s a little too loud

My end goal with this project is to either turn that chassis into a JBOD or buy a new JBOD, and attach it to my main TrueNAS box we are talking about here. Then slowly move to an ESXi setup with 3 much more power efficient servers, hence the shared storage being important.

Do not go above 80%. Past 80% the algo for free space allocation changes.

In ZFS land, once you’ve reached 80% full, you’ve hit 100%. (write) Performance starts to drop off a cliff.

IMO you should set it to 70% and leave yourself 10% for snapshots, and just forget about the rest.

Well yeah, because you have the write IOPS of one HDD. The reads however will be equivalent to that of the combined drives in the vdev.

Edit: The following is from the Storage - Oracle ZFS Storage Appliance Administration Guide

When allocating raw storage to pools, keep in mind that filling pools completely will result in significantly reduced performance, especially when writing to shares or LUNs. These effects typically become noticeable once the pool exceeds 80% full, and can be significant when the pool exceeds 90% full. Therefore, best results will be obtained by overprovisioning by approximately 20%.
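Those two thresholds are easy to turn into a quick check. A minimal sketch in POSIX shell follows; fill_status is a made-up name, and the cutoffs are simply the 80% and 90% figures from the quote above, not anything ZFS itself enforces.

```shell
# fill_status PERCENT
# Classifies a pool fill level using the 80% / 90% guidance quoted above.
# Accepts "65" or "65%".
fill_status() {
    pct=${1%\%}    # strip a trailing percent sign if present
    if [ "$pct" -gt 90 ]; then
        echo "over 90%: significant slowdown likely"
    elif [ "$pct" -gt 80 ]; then
        echo "over 80%: slowdown may be noticeable"
    else
        echo "ok"
    fi
}

fill_status 65%    # prints: ok
```

On a live system the input could come from something like zpool list -H -o capacity tank, where tank is a placeholder pool name.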


Good information. My main pool is at 65%, so perhaps I should look at increasing that soon too

If I have 3 mirrors, all made of 4TB disks, can I swap out the disks in a single mirror to gain space, or do I need to upgrade them all to 8TB disks before I gain space?

You can just plug in e.g. an additional 2x16TB drives and add them as a mirror, done.

zpool add tank mirror disk1 disk2

Mirrors are easy mode in ZFS (using them myself)

Same process for replacing? Ideally I’d like to keep my current 6 Disk footprint to give me room to use the other 6 bays possibly for SSD

Offline and pull one drive → plug in new drive → wait for resilver to complete.

Repeat with other side of mirror → autoexpand should take care of available space automatically. Otherwise use expand feature manually.

8 disks = 4 striped mirrors instead of 3, so roughly 33% more performance. That’s the reason for going with striped mirrors in the first place.

edit: alternatively you can attach the new disks to a mirror (4-way-mirror now!) and then detach the old drives. a bit safer and faster
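Roughly, the two routes look like this on the command line. Treat it as a sketch under assumptions: the function names are made up, tank and the da* device names are placeholders, and the functions only print the commands. The steps must not be run back to back, because each resilver has to finish first. On TrueNAS you would normally do this through the web UI instead.

```shell
# swap_plan POOL OLD NEW
# In-place replacement of one side of a mirror (new drive goes in the old bay).
swap_plan() {
    pool=$1; old=$2; new=$3
    echo "zpool set autoexpand=on $pool"    # grow automatically once all sides are bigger
    echo "zpool offline $pool $old"         # then pull the old drive, insert the new one
    echo "zpool replace $pool $old $new"
    echo "zpool status $pool"               # wait here until the resilver completes
}

# attach_plan POOL OLD_A OLD_B NEW_A NEW_B
# Safer route: grow the mirror to 4-way first, then drop the old drives.
attach_plan() {
    pool=$1; old_a=$2; old_b=$3; new_a=$4; new_b=$5
    echo "zpool set autoexpand=on $pool"
    echo "zpool attach $pool $old_a $new_a"
    echo "zpool attach $pool $old_a $new_b"
    echo "zpool status $pool"               # wait here until the resilver completes
    echo "zpool detach $pool $old_a"
    echo "zpool detach $pool $old_b"
}

swap_plan tank da1 da7
attach_plan tank da1 da2 da7 da8
```

If the extra space does not show up afterwards, zpool online -e on the new devices is the manual expand step mentioned above.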

The 80% full = full metric is particularly relevant to ZFS but almost ALL file systems start having issues around there. Some just creep up slower than others.

Either performance suffers as fragmentation increases, or there is no room left for proper copy on write, etc.

As an extreme example I’ve seen NTFS get so badly fragmented from running so close to full for years that it showed 10% free space that I couldn’t write to until I defragged it. Was an old Win2k box that had been unmaintained for years.

Exact same 80% advice applies to Netapp arrays for instance. For the same reason: they do copy on write like ZFS.

Alright, I feel much better about the 80% thing now. I went and zeroed out all the empty space of all my VM’s, and then did the holepunch command on the ESXi server, and I’ve reduced my total storage space of VM’s down to 800GB
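For anyone finding this later, the zero-and-punch routine goes roughly like this. It is a sketch with assumptions: punch_plan is a made-up helper that only prints the steps, the guest is assumed to be Linux, /zero.fill is an arbitrary file name, and the datastore path is a placeholder. vmkfstools -K is the punch-zero option and wants the VM powered off.

```shell
# punch_plan VMDK_PATH
# Prints (does not run) the steps to reclaim zeroed blocks in a thin VMDK.
punch_plan() {
    vmdk=$1
    echo "# inside the Linux guest: fill free space with zeros, then delete the file"
    echo "dd if=/dev/zero of=/zero.fill bs=1M; rm -f /zero.fill"
    echo "# on the ESXi host, with the VM powered off"
    echo "vmkfstools -K $vmdk"
}

punch_plan /vmfs/volumes/datastore1/myvm/myvm.vmdk
```

The dd step temporarily fills the guest disk (and inflates the thin VMDK), so make sure the datastore has room for that before starting.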

Apart from the single point of failure, are there any other downsides to using a single drive for this? I’m really not concerned if the drive dies, as all these VM’s are backed up every day

I’ve also been looking at SSD’s to make a 6 drive array out of instead of the Micron 9100 Pro. Question, if I have:

2 x 2TB MIRROR ----- 2 x 1TB MIRROR ----- 2 x 1TB MIRROR in an array, does the fact that one of the mirrors is using larger drives affect performance?

I have assumed that in a striped mirror situation the data is striped across all drives. So does that not mean that the larger drives will have worse performance? Or am I not understanding?

Integrity. You won’t be able to correct corrupt data with just a single drive.

ZFS should balance out the load across all vdevs. IOPS should improve as expected, and sequential reads/writes won’t be worse than before, but probably faster on average.

You can test this yourself. With only mirrors, you can remove an entire mirror/vdev from the pool at any time (use zpool remove) if you don’t see a benefit or need the drive bays for SSDs.

The larger drive will get proportionally more writes than the others, so write speed will not see a +50% as expected by going from 2 to 3 stripes. Reads will depend on where the data has been written to before.
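A back-of-the-envelope version of that, assuming writes are spread purely in proportion to each vdev’s free space. That is a simplification (the real allocator weighs more than this), and write_shares is a made-up name. The inputs are the free space per mirror, here for the empty 2TB + 1TB + 1TB layout from the question, in GB.

```shell
# write_shares FREE_0 FREE_1 ...
# Prints the share of new writes each vdev would get if allocation were
# strictly proportional to free space. Integer math, rounded down.
write_shares() {
    total=0
    for free in "$@"; do
        total=$(( total + free ))
    done
    i=0
    for free in "$@"; do
        echo "vdev$i: $(( free * 100 / total ))%"
        i=$(( i + 1 ))
    done
}

write_shares 2000 1000 1000
# prints:
# vdev0: 50%
# vdev1: 25%
# vdev2: 25%
```

So under that assumption the 2TB mirror takes half of all new writes, which is why the third mirror does not add a clean +50%.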

Yeah, I got the same thing. I think that’s just how ZFS works. It’s copy on write, so any time you write, it goes to empty disk. Within your ZVOL the file system and format are whatever the user of the iSCSI drive wants them to be. Outside of that ZFS is still going to do its COW thing.

I’ve had a few doubts if this is the correct approach. If you think about how you might normally use a ZFS server, you’d create one huge pool with all the drives you have and then divvy it up using ZVOLs or whatnot. You’d never allocate all the space. If you needed 800GB it would be a small slice out of your massive pool.

You’re better off leaving the 800GB SSD in the machine that’s actually using it. Or if you want an SSD pool to share out then buy some 2TB SSDs and make a big pool and not use all of it.

As an aside: the more you learn about storage and the issues you run into with various file systems, the more it reinforces that “file storage is hard”.

It’s one of those seemingly simple problems that gets more and more complex the more you think about it; every system makes various trade offs.

And the more you learn about storage and how expensive it can get, you wonder “Just what IS Google doing with all the files stored on free or cheap unlimited storage?”

Surely Google or Amazon would NEVER do anything nefarious with our data… they just want to help us out
