How to ZFS on Dual-Actuator Mach2 drives from Seagate without Worry

The SATA version’s greater versatility makes sense, but I would have thought the SAS-controlled drives would always have the upper hand in performance due to their deeper command queue. Maybe SAS HBAs are doing something funny?

@twin_savage you raise a very valid point, one that I thought would factor more highly in my evals, but it did not for any of the clustered-type file systems I’ve been testing, especially ZFS.

Now, for a direct-attached conventional file system like NTFS or ext, I believe queue depth “might” play a more prominent role, but to be honest I don’t know why one would use this class of drive under those circumstances, so I did not spend much time playing with it.

Now where things get “dicey” is that the SATA product, being a single target rather than two LUNs, has to share the 32-entry command queue between the two actuators. What’s more, if you are careless in your layout, you can create a bit of a file system thread lock by saturating the shared queue with commands for one actuator, while the other actuator is blocked from getting new commands until the queue drains. You can definitely create mutex-style deadlocks here!

That being said, I find that if you create your vdevs on either “A” or “B” actuators, but never combine them to form a single (ZFS) vdev, you will never get the command queue out of balance.

So the rule of thumb for SATA (not SAS!):

  1. All “A” actuators can be combined to form a composite device (even LVM/mdadm RAID).
  2. All “B” actuators can be combined to form a composite device (even LVM/mdadm RAID).
  3. Never, ever create a composite device made up of both “A” and “B” actuators such that one can block on data from the other that can’t be requested for lack of available command queue entries.
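
The three rules above can be checked mechanically before a pool or array is created. The following is only a minimal sketch, assuming a hypothetical naming convention in which each SATA drive has been split into two partitions, with "-part1" covering actuator “A” and "-part2" covering actuator “B”. That convention and the function names are illustrative assumptions, not taken from the provisioning script mentioned in this thread.

```python
import re
from collections import defaultdict

# Assumption (hypothetical convention): "-part1" is actuator A and
# "-part2" is actuator B on every dual-actuator SATA drive.
ACTUATOR_BY_PART = {"1": "A", "2": "B"}


def actuator_of(device):
    """Return 'A' or 'B' for a partition name such as 'disk0-part1'."""
    match = re.search(r"-part(\d+)$", device)
    if not match or match.group(1) not in ACTUATOR_BY_PART:
        raise ValueError("cannot tell which actuator %r belongs to" % device)
    return ACTUATOR_BY_PART[match.group(1)]


def split_by_actuator(devices):
    """Group partitions so that A members and B members stay separate."""
    groups = defaultdict(list)
    for device in devices:
        groups[actuator_of(device)].append(device)
    return dict(groups)


def check_composite(members):
    """Return the single actuator letter used by `members`.

    Raises ValueError if the proposed vdev/array mixes A and B (rule 3).
    """
    actuators = {actuator_of(device) for device in members}
    if len(actuators) != 1:
        raise ValueError(
            "composite device mixes actuators %s: %s" % (sorted(actuators), members)
        )
    return actuators.pop()
```

A caller would pass the member list of each planned vdev, LVM volume group, or mdadm array to `check_composite` and refuse to proceed if it raises.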

My provisioning snippet and script just above take care of all of that, and some LVM examples are included.

Noob to L1tech, but since no one has bothered to ask: is it possible that a consumer-grade (SATA-only) Synology NAS could utilize the dual-actuator design?

The official FAQ seems to indicate it could be done as long as the LBAs are treated differently.

I am working on that right now. It’s not a fun experience unless you like hacking around the Linux underbelly, but I think I can convince Synology to make it a little easier once I get a howto put together.

Just saw the video. I’d be interested to see what your thoughts are on cost/benefit of different pool layouts. RAID Z3 kinda sucks all the fun out of this IMO.

mdraid is how I got 5 gigabytes/sec. RAID 60 worked great too.

It’s been available for well over a year now; I can buy them from 30 different vendors in the EU. But they are waaaaay overpriced.

And I can only buy it in 18TB and not 20TB/22TB.

I can get two regular Exos 18TB drives for the same price as one 2x18TB, and be even faster with double the storage. No, thanks! Unless you are restricted space-wise (and for some reason can’t use SSDs in that case), prices would need to come way down for the Dual-Actuator Mach2 drives to make sense. I really struggle to see any niche where this makes sense.

Is that second hand/used? Are you looking on eBay, or reputable sellers?

Yes: new enterprise disks with a 5-year warranty from reputable sellers cost half as much per TB as the Exos X2.

Thanks, there are a couple here too, and yeah, all over the expected price, twice the normal. I wonder if they are listing it expecting a “lot” of 2 pieces, without realizing it is a single drive with dual actuators?

Just make sure to check the serial numbers when you receive them.

Yes, been ordering for years, never a problem with serial numbers or warranty. My point is the Dual-Actuator Mach2 drives are so horribly overpriced I can’t envision any use case where it makes sense to buy them. Doubling the price is just ridiculous. And that’s not a single vendor that made a mistake with the lot size, that’s over 30 vendors, so I think that’s just the price that Seagate charges in the EU.
I can’t see this for NAS or ZFS, where you’re probably better off just buying twice the HDDs for the same price and getting more speed than a Mach2, or, if you want to save power, going denser with 22TB instead of 16/18TB drives.
Maybe in some proprietary appliance that only fits one HDD, where an SSD would be too expensive but you really, really need that extra bit of speed.

The only place where I can see a use is “I am space-constrained, I have three drive bays, I need more than three drives worth of throughput/IOPS, and I can’t afford SSDs.”

fun fact, if you’re buying LOTS of these drives the price differential is negligible.

The hard drive market is in a weird place right now.

I bought some refurb drives in the past, helium filled, from “official” channels and have had zero failures, too. So there’s that.

high end stock being sold for ~200/drive is old “omg sell the old stock” stock because of macroeconomic conditions, I think

get them while the getting’s good, I think

Well, the regular enterprise drives have been slowly and predictably getting cheaper over the last 2 years.

While the Mach2 is still as ridiculously priced as a year ago.

I am happy that Google, Amazon and other hyperscalers can get Dual actuator disks for a negligible premium.

But until that price difference trickles through to us regular folk and small/medium businesses I’ll have to keep buying the regular Helium filled drives with one actuator.

https://www.serversupply.com/HARD%20DRIVES/SATA-6GBPS/18TB-7200RPM/SEAGATE/ST18000NM0092_368568.htm

refurb, but half off: $220 each for 18TB is very tempting at this performance level… mostly these seem to be never-used customer returns. zero power-on hours for the 6 I ordered so far.

I also ordered two from Server Part Deals (the seller on Newegg, but I ordered direct from their website) and can confirm that SMART data suggested the drives were unused.

Very appealing performance and capacity for the price.

@Kwinz MachII only came in 14TB models, they were the previous generation from about 3 years ago.

They have very little to do with the current generation of ExosX2 except for being the first iteration.

Good catch @bambinone. This was sponsored specifically to address issues where someone might try to stripe the two actuators in a SATA DA drive together.

The SATA interface and the 32-entry command queue are shared between the two actuators. If you don’t follow Wendell’s guides and stripe “A” actuators to “A” actuators and “B” to “B” actuators, you can get into a situation where the command queue is full of commands for only one of the two actuators. This is a problem if you need data from the other actuator to complete the upper-level IO transaction.

I never had an issue with this unless I was running Ceph and the CRUSH maps resulted in data layouts that had IO dependencies.

The BFQ scheduler patches ensure that you always have some space in the command queue to reach the other actuator.
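
To make the starvation scenario concrete, here is a toy model of one shared 32-entry queue. It is only an illustration of the idea of keeping room for the other actuator. The per-actuator cap is a hypothetical stand-in and does not describe how the BFQ patches are actually implemented.

```python
QUEUE_DEPTH = 32  # from the thread: one 32-entry queue shared by both actuators


def admit(pending, per_actuator_cap=None):
    """Fill the shared queue from `pending`, a list of 'A'/'B' tags in submit order.

    per_actuator_cap=None models naive first-come-first-served admission.
    A number models reserving room so one actuator cannot take every slot
    (hypothetical policy, for illustration only).
    Returns how many commands per actuator ended up in the queue.
    """
    in_queue = {"A": 0, "B": 0}
    for tag in pending:
        if sum(in_queue.values()) >= QUEUE_DEPTH:
            break  # queue is full, nothing else can be issued
        if per_actuator_cap is not None and in_queue[tag] >= per_actuator_cap:
            continue  # hold this command back, leave the slot for the other actuator
        in_queue[tag] += 1
    return in_queue


# A burst of 40 commands for actuator A arrives ahead of 8 commands for B.
burst = ["A"] * 40 + ["B"] * 8
naive = admit(burst)
reserved = admit(burst, per_actuator_cap=24)
```

In the naive case every slot is taken by actuator A before any B command is considered, which is the blocked state described above. With the cap, the B commands still find free entries.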

I got my new 2x18 SAS drives in and they’re working great. However my kernel log is getting spammed with the following:

[84302.447898] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84312.462949] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84379.029583] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84471.253048] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84471.256257] mpt3sas_cm0: log_info(0x31120436): originator(PL), code(0x12), sub_code(0x0436)
[84507.029904] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84507.187357] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84563.388526] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84563.559160] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84609.426064] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
[84711.831236] mpt3sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)

Anybody else? SAS HBA is a bog standard SAS3008.
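
For anyone comparing notes on these messages, a small sketch for splitting the log_info value into its fields and counting how often each value shows up in a kernel log. The bit layout used here (top nibble bus type, next nibble originator, then an 8-bit code and a 16-bit sub_code) is an assumption that matches the fields the driver prints in the lines above. The function names are hypothetical, and the sketch makes no claim about what the codes mean.

```python
import re
from collections import Counter

# Assumed originator names; "PL" matches what the driver printed above.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}

LOG_INFO_RE = re.compile(r"log_info\(0x([0-9a-fA-F]{8})\)")


def decode_log_info(value):
    """Split a 32-bit log_info value into the fields shown in the kernel log."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def tally(lines):
    """Count occurrences of each distinct log_info value in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_INFO_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts
```

Feeding the output of `dmesg` to `tally` line by line gives a quick view of whether one value dominates, which is handy when asking others whether they see the same pattern.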