Petabyte Project LTT - Anyone know how it's configured / Tutorial?

Was looking into a good storage server for a workplace and was wondering if anyone wrote up a tutorial on how the PB Project is configured.

All I know is it's running CentOS

1 Like

From what I understand, they're using GlusterFS on ZFS. Pretty sure Linus mentioned he split the drives into multiple small raidz groups (1, 2, or 3, not sure) within a single pool and pointed Gluster at one of the datasets in there.

Not the fastest option, but at that scale it doesn't really matter; it will easily saturate 10GbE. It's definitely built for reliability.
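If it helps, a rough sketch of that kind of stack (pool name, disk names, hostnames, and the volume name are all made up, the zpool/zfs/mkdir part gets repeated on each node, and all tuning is skipped):

zpool create tank raidz2 sda sdb sdc sdd sde sdf raidz2 sdg sdh sdi sdj sdk sdl   # two 6-disk raidz2 vdevs in one pool
zfs create tank/gluster                    # dataset that will hold the Gluster brick
mkdir /tank/gluster/brick
gluster peer probe node2                   # once the other node(s) are up
gluster volume create pbvol node1:/tank/gluster/brick node2:/tank/gluster/brick
gluster volume start pbvol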

2 Likes

I appreciate the help, although it isn't helping me out a whole lot.

Any idea where I can find a tutorial online on how to do something like this and build it out for reliability?

It's going to take a lot of research, since this is getting into the whole "cloud" area.

http://gluster.readthedocs.io/en/latest/Quick-Start-Guide/Quickstart/

This should get you started.

What sort of workload does the client have? Gluster isn't always the best choice.

EDIT: What level of reliability are you looking for? Do you want to protect against a 2-drive failure, or against an entire rack going up in flames?
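The answer changes the Gluster layout quite a bit. A rough sketch of the two directions, with made-up hostnames and brick paths:

# erasure coding: any 2 of the 6 bricks can fail
gluster volume create vol1 disperse 6 redundancy 2 node{1..6}:/tank/brick
# full copies on three machines: a whole node or rack can go up in flames
gluster volume create vol2 replica 3 rack1:/tank/brick rack2:/tank/brick rack3:/tank/brick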

1 Like

Pretty sure the guy from 45 Drives helped him set it up with GlusterFS.
I think his name was Brett Kelly [EDIT: yep, the guy's name is Brett Kelly].
I saw he did a few vids on YT on the 45 Drives channel, but it may have been more a demonstration of how well GlusterFS holds up against failures...

Yeah, I saw all of that. Didn't help much.

On mobile, but I saw he did a vid-

But at work, so didn't watch.
Sorry in advance if it still doesn't help

This might also come in handy??

2 Likes

I think a btrfs single raid is much easier to set up than ZFS + GlusterFS, and should work practically the same (btrfs can do both GlusterFS's and ZFS's job). Basically, to get started, have 2 drives you're ready to trash and then:

mkfs.btrfs -m raid1 -d single /dev/disk1 /dev/disk2;

partprobe;

(if the drives are not empty then you need to add -f, e.g. mkfs.btrfs -f -m raid1 -d single /dev/disk1 /dev/disk2; but make absolutely sure that none of the drives are actually mounted before using -f)

and to mount it:

mount /dev/disk1 /raid_mountpoint;

(you don't need to mount disk1 specifically; you can mount any of the disks belonging to the raid and btrfs will take care of the rest)
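(side note, not part of the original steps: if you want it mounted at boot, every member of a btrfs filesystem shares one UUID, so a single fstab line is enough; the UUID below is a placeholder, grab the real one with blkid)

blkid /dev/disk1                           # prints the filesystem UUID
# then in /etc/fstab:
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /raid_mountpoint  btrfs  defaults  0  0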

and when you want to add a new drive:

btrfs device add /dev/new_disk /raid_mountpoint;

(it's an online, real-time operation and causes zero downtime; it can be done on very busy raids with no visible disruption)
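(one caveat worth adding here: btrfs won't automatically move data that's already on the old drives onto the new one; a balance does that, and it's also an online operation, though it can take a long time on a full array)

btrfs balance start /raid_mountpoint;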

and when a drive needs to be removed:

btrfs device delete /dev/bad_disk /raid_mountpoint;

  • this will make btrfs move all files that were on bad_disk to the other available drives, then remove bad_disk from the raid (again, it's a no-downtime operation, happening completely transparently to any application using the raid like a normal filesystem)

And this scheme should have no problem scaling to petabytes, just like LTT's setup. (Personally I've never scaled it beyond 80 TB with 10x 8 TB drives, but I can't imagine it topping out at anything lower than 254 physical drives, and even that is a very conservative guess made without consulting the actual source code; the real number is probably much higher. A slightly less conservative guess would be 65534 physical drives (uint16 max - 1).)
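(and at any size, these two are handy for checking how data is spread across the drives and whether any drive is reporting errors)

btrfs filesystem usage /raid_mountpoint;
btrfs device stats /raid_mountpoint;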

1 Like

If you think that, you don't understand GlusterFS.

The barrier to entry is not much higher for ZFS and you get a much more reliable filesystem. I got bitten three times by btrfs in 2019. Never again.

5 Likes

2 years later


1 Like


6 Likes

After two years, pretty sure OP is good.