Seeking Storage Pool Solution

Hello all, newish Linux user here.

I just formatted my primary system and installed Xubuntu, hoping to expand my IT-related skills.
The system is my daily driver, with games, a media server, and the occasional VM sandbox. Everything is going well, with the exception of establishing a storage solution equivalent to Windows Storage Spaces. That feature allowed me to pool multiple drives of varying sizes and create a logical drive backed by parity.

Initially, btrfs appealed to me, until I kept reading that it's not quite ready for production-level work, specifically with raid5. I then turned to LVM and started configuring it, only to realize it wouldn't let me utilize the total space even though it recognized all of it. It seemed to be treating every drive as if it had the same capacity as the smallest one.

Is there a command in LVM that I may be overlooking, or is there another avenue I should be exploring altogether?

ZFS. It's super easy to set up and use, and it does what you want. The only downside is that it uses a lot of RAM, roughly 1GB per 1TB of storage.

You can set up vdevs (the building blocks of a pool) as mirrors (RAID 1-like), stripes (RAID 0-like), or as Z1, Z2, Z3, which are like RAID 5/6 where you have 1-3 parity drives in the vdev. You can also mix and match vdevs, so you can have two Z2 vdevs striped together in one pool, etc.
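For example, something like this (the pool name "tank" and the device names are just placeholders, adjust for your own drives):

# two drives mirrored (RAID 1-like)
sudo zpool create tank mirror /dev/sdx /dev/sdy

# or four drives as RAIDZ2 (RAID 6-like, two parity drives)
sudo zpool create tank raidz2 /dev/sdw /dev/sdx /dev/sdy /dev/sdz

# add a second raidz2 vdev later; the pool stripes data across its vdevs
sudo zpool add tank raidz2 /dev/sds /dev/sdt /dev/sdu /dev/sdv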

Mind if I ask how many drives you have and what their sizes are?


ZFS had caught my attention, but I was quickly turned away by the RAM requirements. I've got 32GB of RAM at my disposal, so that's not really an issue; I'm more put off by not knowing why it wants so much overhead.

Drives are listed below; exclude sda and sdc.
Total space = ~7.9TB

NAME FSTYPE SIZE MOUNTPOINT LABEL
sda 111.8G
sdb 3.7T
└─sdb1 3.7T
sdc 465.8G
├─sdc1 vfat 190M /boot/efi
├─sdc2 swap 14.9G [SWAP]
├─sdc3 ext4 18.6G /
└─sdc4 ext4 432G /home
sdd 1.8T
└─sdd1 1.8T
sde 931.5G
└─sde1 931.5G
sdf 931.5G
└─sdf1 931.5G
sdg 698.7G
└─sdg1 698.7G

It has that much overhead because of caching and error checking. It's a file system that runs checksums to prevent data loss.

Also, with 32GB of RAM you're golden.

With your disks, you should create mirror vdevs. I would make one vdev the 4TB, then put the 2TB and the two 1TBs in the other vdev, then mirror them.

Ideally you could buy a WD My Book 16TB and shuck the drives to get 8TB of RAID 1 storage; it's cheaper than buying the bare drives themselves.


I'll look further into ZFS then. Do you have any suggested guides?

Read the Ubuntu docs on it. It's super easy to set up.

Wendel has done some videos on it, I think.

I will just pitch in that it's not a requirement to have 1GB per 1TB; that's more of a worst-case scenario. Whether it uses that much or less depends on the type of data. 1GB per 1TB is more of a suggested best practice than a requirement.

I think it was @Dexter_Kane that somewhere wrote up a good guide on updating with ZFS on Linux, so I hope he can pitch that in if my memory serves me correctly :D

My ZFS pool uses 16GB of RAM; it's 2x 8TB drives.

I mean the kernel will likely give you the memory when you need it, but ZFS does love ram.

Oh I know, I've got a RAID-Z1 setup and it will use as much RAM as you give it. But it can run with much less than 1GB per 1TB without slowdowns; it just doesn't love running that way.

Is there a difference between your suggestion and running z1 (raid5-like)? Am I going to hit the same issue as I did before with LVM if I go with z1?

RAID 5/Z1 is awful; don't use it.

The stress of rebuilding after a dead drive will often kill another drive, leaving you empty-handed. Also, you need same-size drives for that.


Referencing this link: could I achieve your suggestion with the following?

sudo zpool create freight /dev/sdb
sudo zpool add freight mirror /dev/sdd /dev/sde /dev/sdf

I'm not sure if I should include mirror on the first line since it's only one drive.


I settled on btrfs, but RAID 1, because it works. Sure, I lose a lot of space, but HDDs are cheap. I bought a 4-drive-bay external enclosure for my case, and over the last 2 years I have upgraded drives and replaced faulty ones with no problems.

So BTRFS has my support, but only if you're willing to do RAID 1. The benefit is that all the drives can be different sizes and you can upgrade one disk to a larger size with no problem. Currently using:
devid 1 size 2.73TiB used 1.98TiB path /dev/sdc
devid 2 size 1.82TiB used 1.07TiB path /dev/sde
devid 5 size 3.64TiB used 2.89TiB path /dev/sdd
devid 6 size 5.46TiB used 4.71TiB path /dev/sdf
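The commands are basically just these (device names and the mount point here are placeholders, not my actual drives):

# create a raid1 filesystem across drives of different sizes
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdX /dev/sdY /dev/sdZ

# upgrade later: add a bigger drive, then rebalance so data spreads onto it
sudo btrfs device add /dev/sdW /mnt/pool
sudo btrfs balance start /mnt/pool

# or swap a faulty drive for a new one in place
sudo btrfs replace start /dev/sdY /dev/sdV /mnt/pool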

ZFS is rock solid, but you need to keep drive sizes the same for it to work well. For me as a home user, I don't usually have the cash to upgrade a whole pool, or even a matched pair of drives, at a time.


BTRFS is definitely worth considering if you don't have a lot of RAM to throw at this; it's pretty stable these days. I haven't really got much experience using ZFS in a modern RAID, but please, please, please DON'T use any form of RAID 5 with modern large disks. RAID 1 would be my suggestion, but remember BTRFS RAID is NOT traditional block-level RAID; it keeps two copies of each piece of data spread across the devices, which means you can have a 3-disk RAID 1 with it ensuring 2 copies of everything across the 3 disks.

I hear this a lot... and putting a 6TB drive into my RAID 1 BTRFS pool took more than 24 hours of non-stop HD access. So I can understand the worry about stupidly large drives and the time it takes to replicate them.


I've not had much success with my storage issue yet; however, I'm looking into SnapRAID now. Throwing money at it for additional 4TB HDDs would be easier, but something something priorities.
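From what I can tell, the setup is a config file listing the data drives plus a dedicated parity drive (which needs to be at least as large as the biggest data drive), then a periodic sync. Roughly like this, with placeholder paths:

# /etc/snapraid.conf (example layout, paths are placeholders)
parity /mnt/parity1/snapraid.parity
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/

# then run periodically:
sudo snapraid sync
sudo snapraid scrub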

I'll keep on debunking this :-) This is NOT a requirement. The more RAM you have, the more ZFS will cache. If you're running a single-user system or don't move a lot of data, you can get away with way less and never notice. My media center ran ZFS with 512MB of RAM for 2TB of storage, which didn't matter because all I needed was the file that was actually playing.
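If you want to put a hard cap on how much it caches, ZFS on Linux lets you limit the ARC with a module parameter; the 4GB value below is just an example:

# /etc/modprobe.d/zfs.conf -- cap the ARC at 4GB (value in bytes)
options zfs zfs_arc_max=4294967296

# or change it on a running system without rebooting
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max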

Also: Linux is not BSD. ZFS on BSD, especially in the earlier days, was more demanding. Nowadays ZFS is offered as the filesystem for regular desktops when installing FreeBSD.

As to the OP's question: your use case is not the best fit for ZFS. You could do it using partitions, but your IO performance might be a drama. On the other hand, combining large drives with significantly smaller ones usually leads to hot spindles (the 4TB drive is likely to be involved in a lot of the IO).

For ZFS I'd probably partition the 4TB drive and mirror it against the 2TB and 2x 1TB drives. You'd sacrifice the 700GB one (which would only add 350GB after mirroring anyway) and end up with about 3.7TB of usable space.
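In zpool terms that would be something like this, assuming the 4TB drive is repartitioned into roughly 1.8T + 930G + 930G pieces (the partition names are only illustrative, and I'm reusing the pool name from earlier in the thread):

sudo zpool create freight \
  mirror /dev/sdd1 /dev/sdb1 \
  mirror /dev/sde1 /dev/sdb2 \
  mirror /dev/sdf1 /dev/sdb3

# three mirror vdevs striped together: ~1.8T + ~930G + ~930G = ~3.7TB usable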

Or you could go funky and do:

  • mirror sdd with a 1.8T partition on sdb
  • mirror 600GB on sde, sdf, and sdg with respective partitions on sdb
  • mirror 300GB on sde with 300GB on sdf

You'll end up with about 3.9TB of usable space in the pool, all mirrored. It might not even perform much worse than the previous scenario, since ZFS is very random-IO-driven due to COW and the 4TB disk will be the determining factor anyway.

RAIDZ1 options are limited because of the 4TB drive. You'll never get more than 1.8TB of usable space using RAIDZ1, with worse redundancy. You will have an unprotected 1.8TB + 700GB to play with, though.

Honestly, I would not Frankenstein my storage this way :-) For this scenario I'd probably consider btrfs, as you'll want backups anyway if your data is meaningful to you, especially in a setup like this. And for btrfs I would recommend no redundancy scheme other than mirroring for now.


Not really a surprise; the old rule still applies: the more spindles, the better. A BTRFS balance could easily take another 24 hours, but it's not that big an issue if you don't have to do it too often.