damn you of all people using btrfs?
I thought you were a ZFS guru
I’m definitely well versed in ZFS, but far from a guru.
As far as using BTRFS goes, ZFS is definitely better if you have the budget to fit its rigid upgrade requirements, but I like the BTRFS benefit of just popping in another drive, adding it to the pool, and being off to the races.
Is this your main rig or your server?
I’m split between striped mirrors and RAIDZ2 for this build. I think I’m just defaulting to RAID10-for-performance, which doesn’t really matter when you have 12 SSDs… so maybe RAIDZ2 is the way to go.
Currently just my main system (and laptop, because snapshot sync across them is nice). If it proves to be stable, I might move the server over to it, but that makes me a bit nervous, since I’ve got nearly 20TB of data.
@oO.o My recommendation is a RAIDZ2 situation. I think you’ll be happier there.
Well, once Ryzen+ comes out I may be building a watercooled main rig and moving my 1600X/B350 setup to my server, and the Xeon E3-1225 v3 rig to my storage rig. So an SSD array may be in my future.
The EKWB configurator was nice for setting up everything I need for a silent/overclocking rig.
Hey, @sgtawesomesauce, this isn’t specifically an issue with this build because I’m working with drives in multiples of 6, but @carnogaunt contested the stripe size/performance ZFS issue a while back and I’d like to get your take on it:
Stripe size and performance are only an issue when you’re hitting the limits of your disks. I don’t have any solid data, but my “butt-dyno,” so to speak, indicates that I get better responsiveness and slightly more performance when working with “optimal” stripe sizes. It’s not a huge problem, but it can definitely be used to squeeze that little bit of extra performance out of your system.
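To illustrate the rule of thumb being debated here: the old guideline says a RAIDZ vdev splits records evenly when its data-disk count (total disks minus parity) is a power of two. A toy check of that rule, as a sketch only (the guideline is widely considered moot once compression is enabled, since allocations become variable-size anyway):

```python
# Old ZFS rule-of-thumb check: a RAIDZ vdev is "optimally" sized when
# its data-disk count is a power of two, so a record splits evenly
# across the stripe. Largely obsolete with compression enabled.

def raidz_stripe_is_optimal(total_disks: int, parity: int) -> bool:
    """True if data disks = 2^n, i.e. records split evenly."""
    data_disks = total_disks - parity
    return data_disks > 0 and (data_disks & (data_disks - 1)) == 0

# A 6-disk RAIDZ2 has 4 data disks -> "optimal" by this rule.
print(raidz_stripe_is_optimal(6, 2))   # True
# A 12-disk RAIDZ2 has 10 data disks -> not a power of two.
print(raidz_stripe_is_optimal(12, 2))  # False
```

Which matches the “multiples of 6” point above: 6-wide RAIDZ2 vdevs pass the old rule, wider ones generally don’t.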
You mean performance limits, right?
Yes, either IOPS or raw throughput.
Thanks, that was bugging me.
I was going to reply to that thread, but I distinctly remember being too drunk to make a cohesive argument.
05:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
05:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
They are in target mode so I can use targetd to export LUNs to my Proxmox. I’m pretty sure I created a thread about a DIY SAN a while ago.
FWIW, I had 4-, 6-, and 8-disk arrays of 850 Pros on an LSI controller. Even with RAID0 for giggles, it was underwhelming compared to a 960 Pro NVMe, given the complexity of it. Particularly since the OP’s use-case is a relatively small, fast array over 10GbE, you are capped at ~1GB/s, which pretty much any NVMe drive (or pair of them) can do.
Still monkeying around with it: trying to bring up a 40GbE 4x NVMe ZFS cache in front of a 10-HDD array, but I’ve now had 2 of the 10 drives require return for bad blocks (I’m just lucky, I guess).
My fall-back array is an Xpenology box with 10GbE, 4x 10TB HDDs, and a 4x 512GB 850 Pro SSD cache. It produces 600-700MB/s typical sustained sequential throughput after the cache warms up, and frankly, at the application level, its performance is indistinguishable from either a single or dual (RAID0) 960 Pro NVMe locally.
I’m doing database walks, so maybe a photo/video app fares worse, but I wonder… If you limit yourself to 10GbE, that dictates that not much in the way of disk hardware is required.
May I ask which LSI controller you are using to make use of more than 4 SSDs? I don’t get to play with enterprise gear, so I’m just wondering.
And is this the controller card, separate from the actual HBA card, plugged into the disk caddies via breakout cables (SFF-8087 to SATA)?
What share protocol are you using, and is it utilizing RDMA?
This is all Linux, BTW: Linux server, Linux client, NFS mounts…
MegaRAID 9361-8i. Yes, a breakout cable to 4 SATA per port for a total of 8. I also have a 4-port HighPoint card (RR840) that supports 16 SATA without an expansion card (via breakout to 4 SATA). That’s the one currently running the 10-HDD array.
Re: RDMA, I’m still learning about 40GbE. I’ve done some throughput testing with IB and RDMA, but I haven’t gotten NFS + RDMA to work yet (haven’t had much time for it, frankly).
Most of my 40G cards are VPI (dual mode) and dual port. IB mode (through the switch) has been cantankerous: ports dropping to 2.5Gbps and lousy multi-thread performance. IPoIB without RDMA produces 12Gbps and very lumpy results in iperf3. RDMA benchmarks produce line rate, but… still a work in progress.
I’m going to set up a point-to-point star network for 4 compute nodes using multiple 2-port cards on the server. On the clients, I’m using the first port in Ethernet mode connected to that server, and the second in IB mode connected to the switch, until I’ve either worked out all the kinks of IB mode or found a reasonably priced 40GbE switch.
In point-to-point Ethernet (I don’t yet have a 40GbE switch, only IB), I can get 32Gbps (line rate, considering encoding) with 2-4 threads in iperf3, and it scales well: the throughput spreads evenly over the threads regardless of their number, in contrast to IB mode, where one thread gets 12Gbps and the others are severely throttled. With 40GbE p2p, I’ve seen multi-GB/s transfers via NFS from my NVMe drives with little tuning beyond what was already in place for 10GbE (these machines are all on a 10GbE switch as well).
Hey @cekim, thanks for your feedback!
Couple things that jump out at me:
I don’t know the details of your database use-case, but generally, database operations are going to be a lot different from sequential media file reads/writes. Specifically, they’ll favor IOPS over sequential read/write, which is the opposite of my case.
Also, have you adjusted your block size (the ZFS recordsize) to match your database? I have heard that this greatly improves performance.
As noted above, ZOL doesn’t support TRIM yet, so that could be hurting your SSD performance…
Are you using ZFS or hardware RAID? (or something else?)
I experimented with XFS, ext4, and ZFS. I’m using ZFS now and planning on sticking with it. I have an ASUS Hyper M.2 x16 NVMe card, and I’m working on getting x4/x4/x4/x4 bifurcation set up on that server (the ASUS Z10PE hides this option, while ASRock’s equivalent EP2C612 exposes it in the BIOS by default, so the ASUS BIOS needs some cough adjustment). I’m going to use that to provide SSD cache to ZFS in the form of 4x 512GB 960 Pro NVMe.
With the LSI card and 850 Pros, it saturated at about 6 SSDs, producing 2500-3000MB/s in extremely optimized test cases. More typically, it would top out at 1.5-2GB/s for bursty access in RAID0. There was a little difference between XFS and ext4, but nothing huge.
I saw similar behavior from the HighPoint before moving it over to the HDD array. They both delivered as promised and could saturate the 10GbE link, but then the 960 Pro showed up and did roughly the same with a single stick as 6-8 SATA disks.
I wasn’t trying to suggest the SATA SSDs performed poorly; they scaled pretty much as advertised up to the limit of the HBA and easily saturated a 10GbE link with sequential, large-block access. I’m just pointing out that now that 1, 2, 3GB/s NVMe drives are out there, 6 or 8 SATA SSDs no longer make much sense if it’s just speed you’re after (space is another matter, given cost/GB). RAID1 with 2x NVMe, or RAID0 plus backup, or just a single 2TB stick, and call it a day with a much simpler setup.
In my case, the storage requirements aren’t negligible, just not what I’d want to use spinning drives for. Currently I’m planning on 2 RAIDZ2 arrays of 12 500GB 860 Evos each (~10TB of usable space) with 64GB of RAM. I’m getting a good deal on a used box at the price of having to settle for a SAS-2 backplane and HBA, so throughput is going to be capped at ~2400MB/s anyway.
I’m not committed to the RAIDZ2 config, but I figure striped mirrors would be overkill given the SAS-2 bandwidth. If I can get respectable speeds out of 2 bonded 10GbE links, then I’m good.
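For anyone checking the math, the ~10TB figure works out like this (a quick sketch; it ignores ZFS metadata and slop-space overhead, so real usable space will be a bit less):

```python
# Usable capacity for the planned layout: two RAIDZ2 vdevs of
# twelve 500GB drives each, losing two drives per vdev to parity.
vdevs = 2
disks_per_vdev = 12
parity_per_vdev = 2
drive_gb = 500

usable_gb = vdevs * (disks_per_vdev - parity_per_vdev) * drive_gb
print(usable_gb)  # 10000 GB, i.e. ~10TB before filesystem overhead
```

Striped mirrors on the same 24 drives would give only 6000GB usable, which is part of the capacity argument for RAIDZ2 here.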