L1T 172TB ZFS storage server - the details

Hello!

I’m making this thread in response to the various requests I’ve seen (and my own journey of trying to find out) about the configuration specifics of the Level1 172TB ZFS SAN. I’ve seen scattered threads with replies containing small bits and pieces of detail, but nothing concise or comprehensive.

In the video @wendell mentions/shows a bunch of the hardware. There’s a lot of guesswork around most of it, and it would be really nice to have specifics so we could recreate the build on a similar/larger/smaller scale, or even pivot from that reference point.

My questions include, but are not limited to:

  • What disk shelves were used? Wendell mentions rebranded LSI shelves, but they look like DS4243/DS4246 shelves. The difference between these lies in the SAS modules they ship with (3Gbps per channel as opposed to 6Gbps per channel).
  • What HBA/SAS controller was used? Again, LSI was mentioned but specific model, firmware, and configuration would be amazing.
  • ZFS configuration! Pools, vdevs, vdev types, compression, caches, everything!
  • OS configuration. Wendell mentions in another thread that the disk shelves are configured active/active. I assume this is done with something like multipath? Or is it a disk-shelf-specific thing? Either way, I’d be keen to know how it was done.

@wendell I know you’re a busy man with a lot of M&Ms to get through, but I, and I’m sure a bunch of others, would be eternally grateful if you took the time to list this stuff out. I personally have a DS4243 hooked up to an H200e SAS HBA and I get nowhere near the performance and speeds you do locally.


How many spindles do you have?

It’s on Fedora right now, and the controller is currently a 9216. The controller in the Gamers Nexus build thread (tri-mode HBA 9405W) is probably what I’ll move to next.

Each shelf has the SAS6 controller. The 4243 shelves can be upgraded to SAS6, while the 4246 shelves are SAS6 out of the box.

Each shelf has 4 vdevs currently… more vdevs generally help with performance. To save power I’ve brought each shelf online only as we needed the space, so we’re only up to two shelves so far, lol. And it’s still plenty fast enough to saturate 10 gig.

What sort of performance are you getting? Even the Gamers Nexus setup was barely clearing 1 GByte/sec read with 3 vdevs and only 18 drives.

Multipath is helpful to avoid bottlenecks for sure.
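
For reference, an active/active dual-path setup on Linux is usually done with dm-multipath; a minimal sketch is below. This is an assumption about the general approach, not the exact config on the Level1 box, and package names/paths may differ by distro.

    # /etc/multipath.conf (minimal; Debian pkg: multipath-tools, Fedora/RHEL: device-mapper-multipath)
    defaults {
        user_friendly_names yes        # produces the mpathaa/mpathab style names
        path_grouping_policy multibus  # both SAS paths in one group -> active/active I/O
    }

    # then:
    #   systemctl enable --now multipathd
    #   multipath -ll    # each disk should show two active paths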

In terms of datasets, we have many ZFS datasets. The SMB shares generally have case sensitivity turned off. The ones storing videos have compression turned off. The ones storing docker VMs have compression and case sensitivity turned on. I think I turned off synchronous writes since we have a BBU, which is still somewhat dangerous. I also tuned the parameter that defaults to allowing 5 seconds of in-flight data so it allows 30 seconds of in-flight data.
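
For anyone trying to mirror those dataset settings, they map roughly onto the commands below. The pool/dataset names are made up, and the 5-second default being raised to 30 is presumably the zfs_txg_timeout module parameter; treat this as a sketch rather than the exact commands used.

    # 'tank/...' names are hypothetical; casesensitivity can only be set at creation time
    zfs create -o casesensitivity=insensitive -o compression=off tank/videos
    zfs create -o casesensitivity=sensitive   -o compression=lz4 tank/docker

    # async-only writes; only reasonably safe with battery/UPS-backed hardware
    zfs set sync=disabled tank/videos

    # allow ~30 seconds of dirty (in-flight) data per transaction group instead of the 5s default
    echo 30 > /sys/module/zfs/parameters/zfs_txg_timeout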


24 x 15k SAS, 450GB each.

Maybe it’s how I’m testing. I don’t have a 10G network to test over, so I’m doing dd tests locally, as well as copying large amounts of data for prolonged benchmarking. I never really get above 150MB/s.
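
If the numbers are coming from dd against the mountpoint, a rough local test might look like the sketch below (the /netapp mountpoint and the sizes are assumptions). Zeros compress away if compression is on, and ARC caching can turn reads into a memory benchmark, so both need working around.

    # write test: incompressible data, flushed at the end so the figure is honest
    # (note /dev/urandom itself can be a bottleneck; pre-generating the file avoids that)
    dd if=/dev/urandom of=/netapp/testfile bs=1M count=16384 conv=fdatasync status=progress

    # read test: use a file larger than RAM, or export/import the pool first, so the ARC is cold
    dd if=/netapp/testfile of=/dev/null bs=1M status=progress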

How many vdevs?

12, two drives per vdev, all mirrors. I’ve also tried just one big striped pool, but it made no difference.
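
For context, a striped pool of two-way mirrors like that is normally built along these lines; the pool and multipath device names here are just illustrative.

    # each 'mirror diskA diskB' group becomes one vdev; ZFS stripes writes across all vdevs
    zpool create netapp \
      mirror mpathaa mpathab \
      mirror mpathac mpathad \
      mirror mpathae mpathaf
    # ...and so on, one mirror pair per vdev, up to the 12 shown in the zpool status further down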

Mirrors will be facemeltingly fast. Something basic is very wrong with your setup. I could ssh in? Lol idk

Zpool status?

Host os?

One of the things I haven’t had a chance to test yet is changing the PSU configuration. My shelf came with 4 PSUs but I only have one plugged in. Maybe it thinks it’s running with reduced capacity/redundancy and is throttling something? IDK.

# zpool status netapp
  pool: netapp
 state: ONLINE
  scan: none requested
config:

    NAME         STATE     READ WRITE CKSUM
    netapp       ONLINE       0     0     0
      mirror-0   ONLINE       0     0     0
        mpathaa  ONLINE       0     0     0
        mpathab  ONLINE       0     0     0
      mirror-1   ONLINE       0     0     0
        mpathac  ONLINE       0     0     0
        mpathad  ONLINE       0     0     0
      mirror-2   ONLINE       0     0     0
        mpathg   ONLINE       0     0     0
        mpathh   ONLINE       0     0     0
      mirror-3   ONLINE       0     0     0
        mpathi   ONLINE       0     0     0
        mpathj   ONLINE       0     0     0
      mirror-4   ONLINE       0     0     0
        mpathk   ONLINE       0     0     0
        mpathl   ONLINE       0     0     0
      mirror-5   ONLINE       0     0     0
        mpathm   ONLINE       0     0     0
        mpathn   ONLINE       0     0     0
      mirror-6   ONLINE       0     0     0
        mpatho   ONLINE       0     0     0
        mpathp   ONLINE       0     0     0
      mirror-7   ONLINE       0     0     0
        mpathq   ONLINE       0     0     0
        mpathr   ONLINE       0     0     0
      mirror-8   ONLINE       0     0     0
        mpaths   ONLINE       0     0     0
        mpatht   ONLINE       0     0     0
      mirror-9   ONLINE       0     0     0
        mpathu   ONLINE       0     0     0
        mpathv   ONLINE       0     0     0
      mirror-10  ONLINE       0     0     0
        mpathw   ONLINE       0     0     0
        mpathx   ONLINE       0     0     0
      mirror-11  ONLINE       0     0     0
        mpathy   ONLINE       0     0     0
        mpathz   ONLINE       0     0     0

errors: No known data errors

Debian Stretch I believe.

ohhh nooooo

So with SAS drives, they use a crazy amount of power. You need at least two PSUs for the shelf to work properly…


I have 4. So plug in two and remove the others? The common lore online is that the DS4243 gets by with one plugged in just fine…

With 7200 rpm disks one is fine, but I’ve seen some pretty bad stuff happen with one PSU trying to keep a full shelf of 15k drives going…
Use all 4, why not? Also, the drives will overheat unless you have the blanks in for the empty bays.

Ok noted. I will sort this out first thing when I get home and see if it makes any difference.

All my bays are populated so hopefully I skip experiencing that.

@wendell what are you using for automated snapshots through zfs?

The zfs auto snap script in the Gamers Nexus thread, plus cron.
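
For anyone else setting this up: assuming that’s the widely used zfs-auto-snapshot script, the cron side looks roughly like the sketch below. The install path, pool name, and retention values are just examples.

    # /etc/cron.d/zfs-auto-snapshot (sketch): hourly snapshots, keep the last 24
    # '//' tells the script to walk all pools/datasets (honouring the com.sun:auto-snapshot property)
    0 * * * * root /usr/local/sbin/zfs-auto-snapshot --quiet --syslog --label=hourly --keep=24 //

    # or, with no helper script at all, a bare recursive dated snapshot:
    # 0 * * * * root zfs snapshot -r netapp@auto-$(date +\%Y\%m\%d-\%H\%M)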


OK, I think adding the second PSU has solved the issue. One more thing, though: is it normal for the top of the disk shelf to be genuinely hot to the touch? Like nearly to the point where I can’t hold my hand on it. I’m currently moving data onto it, so maybe that’s the cause, but I thought it was worth asking all the same.

@CandyBit if you got 4 PSUs with the shelf, why not plug them all in? The extra PSUs can only provide redundancy if they’re present when the failure occurs; they do nothing if left out.

I unfortunately don’t have the plug sockets to plug them all in. I have them installed so that they can provide cooling at the very least.

Do you mean they are physically installed but not plugged into power? In that case they are merely providing some passive airflow, rather than actual cooling.

I’m fairly sure that’s not the case. I started copying a large dataset to the disk shelf with 2 PSUs plugged in and things got very hot. Once I added the other 2, all the fans on the PSUs kicked on and temps dropped.

Hard drives will work hot, but die quicker. I think the recommended maximum operating temperature is 70C, so you might want to look at fan solutions if you’re over that.
Bursts above that are fine, but hours on end might shorten their life.

There is a reason enterprises use the really loud fans…
