NetApp DS4486 & LSI 9200-8e: issues with drives > 2 TB

My Hardware
NetApp DS4486 disk shelf with 2x IOM6 (one cabled to the FreeNAS host via a QSFP to SFF-8088 cable)
LSI 9200-8e (flashed with IT firmware, in the FreeNAS host)
FreeNAS host: Supermicro X9 with 5640 Xeons, 72 GB RAM

My Issue
When attempting to load a caddy with 4/6 TB drives, I get the following in the dmesg log of the FreeNAS host:

da1 at mps0 bus 0 scbus0 target 8 lun 2
da1: <HITACHI HUS724040ALE64DB NA01> Fixed Direct Access SPC-3 SCSI device
da0 at mps0 bus 0 scbus0 target 8 lun 1
da1: Serial Number PAGU83ES           
da0: <HITACHI HUS724040ALE64DB NA01> Fixed Direct Access SPC-3 SCSI device
da0: Serial Number PAGU82VS           
da0: 600.000MB/s transfers
da1: 600.000MB/s transfers
da1: Command Queueing enabled
da1: Attempt to query device size failed: ABORTED COMMAND, Internal target failure
da0: Command Queueing enabled
da0: Attempt to query device size failed: ABORTED COMMAND, Internal target failure

As a test, I populated a caddy with a single 1 TB drive and did not have this problem, which makes me really nervous:

da25 at mps0 bus 0 scbus0 target 19 lun 1
da25: <HITACHI HUS724040ALE64DB NA01> Fixed Direct Access SPC-3 SCSI device
da25: Serial Number PAGS6N6S           
da25: 600.000MB/s transfers
da25: Command Queueing enabled
da25: 953869MB (1953525168 512 byte sectors)
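
As a sanity check on that last line, the reported size follows directly from the sector count. A minimal Python sketch (the helper name capacity is made up for illustration) that redoes the arithmetic with the numbers from the da25 output:

```python
# Sector count and sector size copied from the da25 size line above.
SECTORS = 1953525168
SECTOR_BYTES = 512


def capacity(sectors: int, sector_bytes: int) -> dict:
    """Return the raw capacity in bytes, in binary megabytes, and in decimal TB."""
    total = sectors * sector_bytes
    return {
        "bytes": total,
        "mib": total // (1024 * 1024),  # the log labels this unit "MB"
        "tb": total / 1000**4,          # decimal terabytes, as on the drive label
    }


info = capacity(SECTORS, SECTOR_BYTES)
print(info["bytes"])  # 1000204886016
print(info["mib"])    # 953869, the same figure as in the da25 line
```

So the "953869MB" in the log is a binary-megabyte figure for what is a 1 TB drive in decimal units.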



root@constellation:~ # camcontrol devlist
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 8 lun 0 (pass1)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 8 lun 1 (da0,pass2)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 8 lun 2 (da1,pass20)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 9 lun 0 (pass3)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 10 lun 0 (pass4)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 11 lun 0 (pass5)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 12 lun 0 (pass6)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 13 lun 0 (pass7)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 14 lun 0 (pass8)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 15 lun 0 (pass9)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 16 lun 0 (pass10)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 17 lun 0 (pass11)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 18 lun 0 (pass12)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 19 lun 0 (pass13)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 19 lun 1 (da25,pass51)
<NETAPP DS448IOM6 0173>            at scbus0 target 20 lun 0 (pass14,ses0)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 21 lun 0 (pass15)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 22 lun 0 (pass16)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 23 lun 0 (pass17)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 24 lun 0 (pass18)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 25 lun 0 (pass19)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 27 lun 0 (pass21)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 28 lun 0 (pass22)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 29 lun 0 (pass23)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 30 lun 0 (pass24)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 31 lun 0 (pass25)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 32 lun 0 (pass26)
<DOGFISH 30G Q0707A>               at scbus2 target 0 lun 0 (pass27,ada0)
<HPT DISK 0_0 4.00>                at scbus6 target 0 lun 0 (pass28,da2)
<HPT DISK 0_1 4.00>                at scbus6 target 1 lun 0 (pass29,da3)
<HPT DISK 0_2 4.00>                at scbus6 target 2 lun 0 (pass30,da4)
<HPT DISK 0_3 4.00>                at scbus6 target 3 lun 0 (pass31,da5)
<HPT DISK 0_4 4.00>                at scbus6 target 4 lun 0 (pass32,da6)
<HPT DISK 0_5 4.00>                at scbus6 target 5 lun 0 (pass33,da7)
<HPT DISK 0_6 4.00>                at scbus6 target 6 lun 0 (pass34,da8)
<HPT DISK 0_7 4.00>                at scbus6 target 7 lun 0 (pass35,da9)
<HPT DISK 0_8 4.00>                at scbus6 target 8 lun 0 (pass36,da10)
<HPT DISK 0_9 4.00>                at scbus6 target 9 lun 0 (pass37,da11)
<HPT DISK 0_10 4.00>               at scbus6 target 10 lun 0 (pass38,da12)
<HPT DISK 0_11 4.00>               at scbus6 target 11 lun 0 (pass39,da13)
<HPT DISK 0_12 4.00>               at scbus6 target 12 lun 0 (pass40,da14)
<HPT DISK 0_13 4.00>               at scbus6 target 13 lun 0 (pass41,da15)
<HPT DISK 0_14 4.00>               at scbus6 target 14 lun 0 (pass42,da16)
<HPT DISK 0_15 4.00>               at scbus6 target 15 lun 0 (pass43,da17)
<HPT DISK 0_16 4.00>               at scbus6 target 16 lun 0 (pass44,da18)
<HPT DISK 0_17 4.00>               at scbus6 target 17 lun 0 (pass45,da19)
<HPT DISK 0_18 4.00>               at scbus6 target 18 lun 0 (pass46,da20)
<HPT DISK 0_19 4.00>               at scbus6 target 19 lun 0 (pass47,da21)
<HPT DISK 0_20 4.00>               at scbus6 target 20 lun 0 (pass48,da22)
<HPT DISK 0_21 4.00>               at scbus6 target 21 lun 0 (pass49,da23)
<HPT DISK 0_22 4.00>               at scbus6 target 22 lun 0 (pass50,da24)
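
If it helps to see which shelf targets actually expose a disk, the devlist output can be filtered with a few lines of Python. This is only a sketch: the regular expression is my assumption about the line format based on the listing above, and parse_devlist / disks_on_bus are hypothetical helper names. The sample is a handful of lines copied from the listing.

```python
import re

# Assumed line format: <VENDOR PRODUCT REV>  at scbusN target T lun L (dev,dev)
LINE_RE = re.compile(
    r"<(?P<ident>[^>]+)>\s+at scbus(?P<bus>\d+) target (?P<target>\d+) "
    r"lun (?P<lun>\d+) \((?P<devs>[^)]*)\)"
)


def parse_devlist(text):
    """Turn camcontrol devlist text into a list of dicts, skipping other lines."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            entries.append({
                "ident": m["ident"].strip(),
                "bus": int(m["bus"]),
                "target": int(m["target"]),
                "lun": int(m["lun"]),
                "devs": m["devs"].split(","),
            })
    return entries


def disks_on_bus(entries, bus):
    """Return (target, lun, daN) for every entry on `bus` that has a da device."""
    found = []
    for entry in entries:
        if entry["bus"] != bus:
            continue
        for dev in entry["devs"]:
            if re.fullmatch(r"da\d+", dev):
                found.append((entry["target"], entry["lun"], dev))
    return found


SAMPLE = """\
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 8 lun 0 (pass1)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 8 lun 1 (da0,pass2)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 8 lun 2 (da1,pass20)
<MARVELL LUIGI_V2_STSB_DB 2015>    at scbus0 target 19 lun 0 (pass13)
<HITACHI HUS724040ALE64DB NA01>    at scbus0 target 19 lun 1 (da25,pass51)
<NETAPP DS448IOM6 0173>            at scbus0 target 20 lun 0 (pass14,ses0)
<HPT DISK 0_0 4.00>                at scbus6 target 0 lun 0 (pass28,da2)
"""

entries = parse_devlist(SAMPLE)
shelf_disks = disks_on_bus(entries, 0)
print(shelf_disks)
```

On the sample, target 8 exposes two disk LUNs (da0, da1) and target 19 exposes one (da25), which lines up with a caddy holding two drives and a caddy holding one.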

Is there a firmware update for the shelf IOM?

If you connect a big drive directly to the HBA is it detected correctly?

Are the drives SAS or SATA?

There likely is. I bought this on eBay and just received it. As for getting the firmware updates, I'm kind of at a loss: the NetApp site doesn't seem to offer them to unsupported, non-enterprise users, unless I'm mistaken.

The HBA I have is external only, with 2x SFF-8088 SAS connectors, so I'm unable to test hooking a drive up directly to the HBA.

The drives I am using for testing are SAS drives.

Could it be a firmware issue on the HBA? (I know there is firmware on them, but I'm not sure whether it could cause an issue. I know FreeNAS made me update mine once, but I haven't had any issues with larger drives.)

Assuming you made an account there already?

Good thought, but I'm currently on the P20 IT-mode firmware, which I believe is the recommended version.

Pretty sure that's new enough.

I'd like to pursue updating the disk shelf firmware, but I don't know where to begin.

Make an account on the NetApp support site? Look around once you're logged in; they might offer downloads or let you register the product to gain access to the firmware or something. I don't have any experience with NetApp, but a few companies lock that stuff down a bit.

Same here… I know if you had an HPE shelf, you’d need an active support agreement for “anything” with HPE, they don’t check specific entitlements just in general.

This wouldn’t be pretty, but you should be able to directly connect a drive to the controller for a brief test:

I wonder… is your 1T drive single port, and the 6T drives dual-port? Maybe they’re upset you only have one IOM connected to the host?

Thanks for the suggestion… the more reading I do, the more I am being led down the path of replacing the IOM6 controller with a generic one like the HB-SBB2-E601-COMP 0952913-07, which should maybe make it easier to upgrade firmware? Not sure.

Do you have enough cables to connect both IOMs to the host?

No, I don't…

I went ahead and moved the LSI card to a Linux machine to see if it's some sort of BSD thing… but got the same result.

[ 2.146395] sd 0:0:23:1: [sdd] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.146398] sd 0:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.146400] sd 0:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.146523] sd 0:0:23:1: [sdd] 0 512-byte logical blocks: (0 B/0 B)
[ 2.146531] sd 0:0:23:1: [sdd] 0-byte physical blocks
[ 2.146618] sd 0:0:23:1: [sdd] Write Protect is off
[ 2.146620] sd 0:0:23:1: [sdd] Mode Sense: bf 00 00 08
[ 2.146661] sd 0:0:22:2: [sdc] Unit Not Ready
[ 2.146663] sd 0:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.146666] sd 0:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.146796] sd 0:0:23:1: [sdd] Write cache: disabled, read cache: enabled, doesn’t support DPO or FUA
[ 2.146859] sd 0:0:22:2: [sdc] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.146861] sd 0:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.146863] sd 0:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.147064] sd 0:0:22:2: [sdc] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147066] sd 0:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.147069] sd 0:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.147309] sd 0:0:23:1: [sdd] Unit Not Ready
[ 2.147311] sd 0:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.147314] sd 0:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.147511] sd 0:0:23:1: [sdd] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147513] sd 0:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.147515] sd 0:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.147583] sd 0:0:22:2: [sdc] Attached SCSI disk
[ 2.147709] sd 0:0:23:1: [sdd] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147711] sd 0:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.147714] sd 0:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.148180] sd 0:0:23:1: [sdd] Attached SCSI disk
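
Worth noting from the log: the kernel tries both READ CAPACITY(16) and READ CAPACITY(10), and both fail. READ CAPACITY(10) only has a 32-bit block address, so on its own it tops out at 2 TiB with 512-byte sectors, but the 16-byte variant failing as well suggests this is probably not just a 2 TB addressing limit. A rough Python sketch for summarising which variants failed per device (the regex and function name are my own assumptions; the sample lines are copied from the log above):

```python
import re
from collections import defaultdict

# Largest size READ CAPACITY(10) can describe with 512-byte sectors: 2 TiB.
READ_CAP10_LIMIT_BYTES = 2**32 * 512

# Matches e.g. "sd 0:0:22:2: [sdc] Read Capacity(16) failed: ..."
FAIL_RE = re.compile(
    r"sd (?P<hctl>\d+:\d+:\d+:\d+): \[(?P<dev>\w+)\] "
    r"Read Capacity\((?P<kind>10|16)\) failed"
)


def read_capacity_failures(log_text):
    """Map (H:C:T:L, device) to the sorted list of READ CAPACITY variants that failed."""
    failures = defaultdict(set)
    for line in log_text.splitlines():
        m = FAIL_RE.search(line)
        if m:
            failures[(m["hctl"], m["dev"])].add(int(m["kind"]))
    return {key: sorted(kinds) for key, kinds in failures.items()}


SAMPLE = """\
[ 2.146859] sd 0:0:22:2: [sdc] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147064] sd 0:0:22:2: [sdc] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147511] sd 0:0:23:1: [sdd] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.147709] sd 0:0:23:1: [sdd] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
"""

result = read_capacity_failures(SAMPLE)
for (hctl, dev), kinds in result.items():
    print(hctl, dev, "failed READ CAPACITY variants:", kinds)
```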

On the other forum, you indicated the drives work fine on a different controller. Has anything changed since then?

Actually, I was mistaken. I have tried the drive on another backplane with an Adaptec RAID card and I'm getting the same results:

[ 2.139614] sd 1:0:23:1: [sdd] 0 512-byte logical blocks: (0 B/0 B)
[ 2.139616] sd 1:0:23:1: [sdd] 0-byte physical blocks
[ 2.139705] sd 1:0:23:1: [sdd] Write Protect is off
[ 2.139707] sd 1:0:23:1: [sdd] Mode Sense: bf 00 00 08
[ 2.139868] sd 1:0:23:1: [sdd] Write cache: disabled, read cache: enabled, doesn’t support DPO or FUA
[ 2.140043] sd 1:0:22:2: [sdc] Unit Not Ready
[ 2.140046] sd 1:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.140048] sd 1:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.140245] sd 1:0:22:2: [sdc] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.140247] sd 1:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.140249] sd 1:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.140458] sd 1:0:22:2: [sdc] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.140461] sd 1:0:22:2: [sdc] Sense Key : Aborted Command [current] [descriptor]
[ 2.140464] sd 1:0:22:2: [sdc] Add. Sense: Internal target failure
[ 2.140593] sd 1:0:23:1: [sdd] Unit Not Ready
[ 2.140595] sd 1:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.140598] sd 1:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.140791] sd 1:0:23:1: [sdd] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.140793] sd 1:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.140795] sd 1:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.140987] sd 1:0:23:1: [sdd] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 2.140989] sd 1:0:23:1: [sdd] Sense Key : Aborted Command [current] [descriptor]
[ 2.140991] sd 1:0:23:1: [sdd] Add. Sense: Internal target failure
[ 2.141172] sd 1:0:22:2: [sdc] Attached SCSI disk
[ 2.141528] sd 1:0:23:1: [sdd] Attached SCSI disk

Is it possible that running badblocks -wsv -b 4096 on these disks caused something to go haywire when reading their size?

Are these old NetApp appliance drives by chance? I'm currently converting the last of mine to a normal sector size and got the same errors in my log on bootup. If so, I used a guide on ServeTheHome, "How to reformat HDD & SSD to 512B Sector Size".

TL;DR:
install sg3_utils (the package is named sg3-utils on Debian/Ubuntu)
sg_scan -i
sg_format --format --size=512 /dev/sgX   <-- X is the drive number from sg_scan
Edit: PS, this worked for me, but I don't know jack (afayk) and I'm using an eBay SAS cable, no shelf (yet).
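
If you end up scripting the reformat across several drives, it may be worth building the command in a dry-run wrapper first, since sg_format wipes the drive. A small Python sketch, assuming the same flags as in the steps above; build_format_command is a hypothetical helper, /dev/sg3 is a placeholder device, and nothing here actually runs sg_format:

```python
import re
import shlex

# Only accept SCSI generic device nodes of the form /dev/sgN.
SG_DEVICE_RE = re.compile(r"/dev/sg\d+")


def build_format_command(device, sector_size=512):
    """Return the sg_format argument list without executing anything.

    Raises ValueError if the device path does not look like /dev/sgN.
    """
    if not SG_DEVICE_RE.fullmatch(device):
        raise ValueError(f"expected a /dev/sgN path, got {device!r}")
    if sector_size <= 0:
        raise ValueError("sector size must be positive")
    return ["sg_format", "--format", f"--size={sector_size}", device]


# Placeholder device: substitute the sg node reported by sg_scan -i.
cmd = build_format_command("/dev/sg3")
print(shlex.join(cmd))  # sg_format --format --size=512 /dev/sg3
```

Printing the command and eyeballing the device name before running it by hand is a cheap guard against formatting the wrong disk.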

I’ve seen errors like this (not quite the same) when the controllers are configured for multipath but the drives are missing interposers. Do you have any SAS drives to try? If SAS works but not SATA, the shelf expects multipath and won’t OK the disk until it can shake hands properly.

Thanks to both of you for getting back to me. I've done a bit more troubleshooting in the past couple of days, and I feel like the problem is that these disks are not spinning up in this disk shelf.

Wendell: The drive that is working is an (older) WD SAS 1 TB.
The drives that are not working (in this shelf) are Seagate ST6000NM0095s (Exos 6 TB).

I just put in a SATA disk to try your suggestion, and it came up fine.

[264785.375599] scsi 1:0:29:0: Processor         MARVELL  LUIGI_V2_STSB_DB 2015 PQ: 0 ANSI: 5
[264785.375608] scsi 1:0:29:0: SSP: handle(0x001d), sas_addr(0x500a0980024b2db5), phy(28), device_name(0x0000000000000000)
[264785.375611] scsi 1:0:29:0: enclosure logical id (0x500a0980024cfdc8), slot(20) 
[264785.375684] scsi 1:0:29:0: Power-on or device reset occurred
[264785.376365] scsi 1:0:29:0: Attached scsi generic sg33 type 3
[264785.404907] scsi 1:0:29:1: Direct-Access     HITACHI  HUS724040ALE64DB NA01 PQ: 0 ANSI: 5
[264785.404914] scsi 1:0:29:1: SSP: handle(0x001d), sas_addr(0x500a0980024b2db5), phy(28), device_name(0x0000000000000000)
[264785.404916] scsi 1:0:29:1: enclosure logical id (0x500a0980024cfdc8), slot(20) 
[264785.405010] scsi 1:0:29:1: Power-on or device reset occurred
[264785.405694] sd 1:0:29:1: Attached scsi generic sg34 type 0
[264785.405738] sd 1:0:29:1: [sde] Spinning up disk...
[264786.436204] .
[264787.460267] .
[264788.484276] .
[264789.508238] .
[264789.508326] sd 1:0:29:1: Inquiry data has changed
[264789.508580] ready
[264789.508751] sd 1:0:29:1: [sde] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[264789.508753] sd 1:0:29:1: [sde] 4096-byte physical blocks
[264789.508773] sd 1:0:29:1: [sde] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
[264789.508775] sd 1:0:29:1: [sde] 4096-byte physical blocks
[264789.508832] sd 1:0:29:1: [sde] Write Protect is off
[264789.508834] sd 1:0:29:1: [sde] Mode Sense: bf 00 00 08
[264789.508883] sd 1:0:29:1: [sde] Write Protect is off
[264789.508885] sd 1:0:29:1: [sde] Mode Sense: bf 00 00 08
[264789.509028] sd 1:0:29:1: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[264789.509075] sd 1:0:29:1: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[264789.509285] sde: detected capacity change from 0 to 1000204886016
[264789.588154]  sde:
[264789.590116] sd 1:0:29:1: [sde] Attached SCSI disk

I have posted a new thread so as not to confuse the troubleshooting direction:

Do those drives need the same power-pin fix as the white-label drives people shuck from WD Elements externals?

I have tried taping off pins 1-3 with a thin strip of Kapton tape, but that has not helped to get the drives spinning. See below:

[332939.859646] sd 1:0:32:1: [sdf] Spinning up disk...
[332940.864363] .
[332941.888373] .
[332942.912384] .
[332942.912692] not responding...
[332942.913003] sd 1:0:32:1: [sdf] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
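
To compare the working and non-working cases side by side, here is a small Python sketch that pulls the spin-up outcome per device out of a kernel log. It assumes only one disk is spinning up at a time, so a bare "ready" or "not responding..." line belongs to the most recent "Spinning up disk..." line. The function name is made up, and the sample lines are copied from the two logs in this thread.

```python
import re

SPINUP_RE = re.compile(r"\[(?P<dev>sd[a-z]+)\] Spinning up disk")
STAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")  # leading "[123.456] " timestamp


def spinup_outcomes(log_text):
    """Return {device: 'ready' | 'not responding' | 'unknown'} for each spin-up attempt."""
    outcomes = {}
    pending = None  # device currently waiting for a result line
    for line in log_text.splitlines():
        msg = STAMP_RE.sub("", line).strip()
        m = SPINUP_RE.search(msg)
        if m:
            pending = m["dev"]
            outcomes[pending] = "unknown"
        elif pending and msg == "ready":
            outcomes[pending] = "ready"
            pending = None
        elif pending and msg.startswith("not responding"):
            outcomes[pending] = "not responding"
            pending = None
    return outcomes


SAMPLE = """\
[264785.405738] sd 1:0:29:1: [sde] Spinning up disk...
[264786.436204] .
[264789.508326] sd 1:0:29:1: Inquiry data has changed
[264789.508580] ready
[332939.859646] sd 1:0:32:1: [sdf] Spinning up disk...
[332940.864363] .
[332942.912692] not responding...
"""

outcomes = spinup_outcomes(SAMPLE)
print(outcomes)
```

On the sample, sde (the disk that came up) is reported as ready and sdf (the taped drive) as not responding.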