ZFS With Many Multipath Drives

Background

I have 23 x 8 TB drives in an old NetApp DS4246 shelf with two IOM6 controllers, each wired by a single connection to one of the two ports on my single HP H221 9207-8e HBA in an Ubuntu box. (It turns out the H221 is actually a rebranded 9205-8e that would only work properly after flashing over the shipped firmware with older 9205-8e firmware and wiping the boot support, but that was yesterday's problem.) Because both IOM6 controllers are directly connected, each drive reports twice in lsblk or fdisk -l. I enabled multipathing with multipath -v2, and multipath -ll now correctly shows only 23 x 7.3 T drives.
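For anyone following along, the multipath setup on Ubuntu amounts to roughly the following (a sketch; package and service names are as shipped on recent Ubuntu releases, and the mpath* aliases are whatever multipathd assigns):

```shell
# Install the device-mapper multipath tools and start the daemon
sudo apt install multipath-tools
sudo systemctl enable --now multipathd

# Create multipath maps for all detected paths, with verbose output
sudo multipath -v2

# Verify: each physical drive should now appear exactly once
sudo multipath -ll
ls /dev/mapper/mpath*
```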

Problem

Attempting to create a zpool with any number n > 1 of multipathed drives (e.g. /dev/mapper/mpatha) immediately fails with internal error: out of memory. The system has 64 GB of memory, of which around 1% is in use at idle before attempting the operation.
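Concretely, the failing invocation looks like this (the pool name and raidz2 layout here are illustrative; any layout with more than one of the multipath devices triggers the error):

```shell
# Fails immediately with "internal error: out of memory"
sudo zpool create tank raidz2 \
    /dev/mapper/mpatha /dev/mapper/mpathb /dev/mapper/mpathc
```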

Question

I have found similar-sounding(?) issues referenced in commit history (see bug 226096 in the FreeBSD Bugzilla for ZFS), and it's not clear to me that they were ever addressed, but I am not familiar enough with ZFS to say for certain that this is the same issue. In any case, the workaround discussed there had no effect. Is 64 GB actually an insufficient amount of memory for this array? Have I run into some limitation or bug related to using large multipath arrays with ZFS? Given that high performance is not critical for my use case, should I just abandon ZFS altogether and use some software RAID0 solution?

I’ve solved many problems by lurking here over the years, and I appreciate the wisdom y’all have. Thanks in advance for any insight you can provide.

Could you share some config files? /etc/multipath.conf and the output from multipath -ll, perhaps?

Output from multipath -ll

mpatha (35000cca2545022cc) dm-1 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:0:0  sdb 8:16   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:24:0 sdy 65:128 active ready running
mpathb (35000cca254273dbc) dm-2 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:1:0  sdc 8:32   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:25:0 sdz 65:144 active ready running
mpathc (35000cca2542514b0) dm-29 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:2:0  sdd  8:48   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:26:0 sdaa 65:160 active ready running
mpathd (35000cca26108b2f0) dm-33 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:3:0  sde  8:64   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:27:0 sdab 65:176 active ready running
mpathe (35000cca25426c194) dm-34 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:4:0  sdf  8:80   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:28:0 sdac 65:192 active ready running
mpathf (35000cca25453ea08) dm-37 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:5:0  sdg  8:96   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:29:0 sdad 65:208 active ready running
mpathg (35000cca254b0f350) dm-40 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:6:0  sdh  8:112  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:30:0 sdae 65:224 active ready running
mpathh (35000cca254273db0) dm-43 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:7:0  sdi  8:128  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:31:0 sdaf 65:240 active ready running
mpathi (35000cca2545070c4) dm-44 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:8:0  sdj  8:144 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:32:0 sdag 66:0  active ready running
mpathj (35000cca254268be8) dm-47 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:9:0  sdk  8:160 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:33:0 sdah 66:16 active ready running
mpathk (35000cca26107f828) dm-3 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:10:0 sdl  8:176 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:34:0 sdai 66:32 active ready running
mpathl (35000cca254268b10) dm-4 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:11:0 sdm  8:192 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:35:0 sdaj 66:48 active ready running
mpathm (35000cca23bd37ad0) dm-5 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:12:0 sdn  8:208 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:36:0 sdak 66:64 active ready running
mpathn (35000cca261087890) dm-6 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:13:0 sdo  8:224 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:37:0 sdal 66:80 active ready running
mpatho (35000cca254273e64) dm-13 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:14:0 sdp  8:240 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:38:0 sdam 66:96 active ready running
mpathp (35000cca254273d84) dm-16 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:15:0 sdq  65:0   active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:39:0 sdan 66:112 active ready running
mpathq (35000cca25420efd0) dm-19 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:16:0 sdr  65:16  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:40:0 sdao 66:128 active ready running
mpathr (35000cca254268aec) dm-22 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:17:0 sds  65:32  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:41:0 sdap 66:144 active ready running
mpaths (35000cca2544bfe14) dm-25 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:18:0 sdt  65:48  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:42:0 sdaq 66:160 active ready running
mpatht (35000cca2545090fc) dm-28 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:19:0 sdu  65:64  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:43:0 sdar 66:176 active ready running
mpathu (35000cca254220684) dm-30 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:20:0 sdv  65:80  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:44:0 sdas 66:192 active ready running
mpathv (35000cca254506d24) dm-31 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:21:0 sdw  65:96  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:45:0 sdat 66:208 active ready running
mpathw (35000cca2544fe554) dm-32 HGST,HUH728080AL5205
size=7.3T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 7:0:22:0 sdx  65:112 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:46:0 sdau 66:224 active ready running

multipath.conf

defaults {
    user_friendly_names yes
}

blacklist {
    devnode "^(ram|zram|raw|loop|fd|md|dm-|sr|scd|st|dcssblk)[0-9]"
    devnode "^(td|hd|vd)[a-z]"
    devnode "^cciss!c[0-9]d[0-9]*"
}

So (forgive any obvious statement, I'm not familiar with the NetApp shelves), you are active/passive on the shelf controller.
Do you get the same error if you only connect one cable?

The units should be plug and play.
Your having to run a different firmware on the HBA would be my first suspect … do you have access to any other HBA to try out?

Thanks for the thoughts, Matt.

I ended up discovering that some of the 8 TB HGST drives I was using had protection enabled, which prevented them from being fully formatted. Printed on each drive is a PSID and corresponding QR code that can be used with sedutil-cli --PSIDrevertAdminSP <PSID> /dev/XXX to remove the protection.
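In case anyone else hits this, the check-and-revert sequence would look something like the following (sedutil-cli from the Drive-Trust-Alliance sedutil project; the device name is illustrative, and note that a PSID revert cryptographically erases all data on the drive):

```shell
# List drives and whether they are Opal/SED capable
sudo sedutil-cli --scan

# Show the security/lock state of one suspect drive
sudo sedutil-cli --query /dev/sdb

# Factory-reset the drive's security state using the PSID printed
# on its label (enter it without dashes).
# WARNING: this erases everything on the drive.
sudo sedutil-cli --PSIDrevertAdminSP <PSID-from-label> /dev/sdb
```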


I am so confused reading this.
Why would you buy an HDD you can't format?
Given the price, those aren't proprietary NetApp drives, so … could you share where you read about this so I can read it too, please?

Good catch 🙂 the error thrown by ZFS could have been a little more insightful, though 🙂


Just be aware that multipathing has a small quirk where each path appears as a drive in /dev, so a filesystem that does auto-scanning may pick up the same drive multiple times, once via each path.

I have no idea if this will be an issue for ZFS itself. But if it is, I believe you can do something like "zpool import -d /dev/mapper tank" to limit which block devices it tries to auto-discover and import.
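A minimal sketch of that, assuming a pool named tank built on the multipath maps:

```shell
# Restrict device scanning to the multipath maps, so ZFS does not
# also see the same disk once per underlying path (sdb, sdy, ...)
sudo zpool import -d /dev/mapper tank

# Check which device nodes the pool actually ended up using
zpool status tank
```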

The solution is explained in the last post on this thread: [SOLVED] Help with a SAS Drive - General Support - Unraid

finally figured this one out in case it helps anyone.

the issue I was having was that the disk would not format due to the protection, so I needed to disable it. there is a PSID printed on the disk. I used it to reset the disk with the command below, where <PSIDNODASHS> is the PSID on the disk entered without dashes and <device> is the device:

sedutil-cli --PSIDrevertAdminSP <PSIDNODASHS> /dev/<device>

I was then able to run the format command, and Type 2 Protection was now removed.

Separate from the main thread: one might buy drives that have a non-standard sector size, and Wendell did a post on formatting the drives back to a more standard sector size. It's a good way to pick up a bargain if they are on sale.
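For reference, reformatting such drives to 512-byte sectors is typically done with sg_format from the sg3_utils package, something like the following (device name illustrative; this destroys all data and can take many hours on an 8 TB drive):

```shell
# Check the drive's current logical block size
sudo sg_readcap /dev/sdb

# Low-level format to 512-byte sectors (destructive, slow)
sudo sg_format --format --size=512 /dev/sdb
```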

As for the out-of-memory error OP encountered, I would have guessed it involved some memory other than main memory, but I have no idea about RAID cache memory, and it seems I would not have helped anyway. Welp, glad OP figured it out…

1 Like