Hard Drive issue on FreeNAS 9.10.2-U6

Hi folks. I logged into my FreeNAS server today via the web interface and was greeted with this message:
“Device: /dev/ada0, 16 Currently unreadable (pending) sectors”

It seems that one of my hard drives (ada0) needs to be replaced. The question I have is: how do I determine which physical drive in my server is the malfunctioning one? I have 10 WD Red 6TB drives (RAIDZ2) in there, and I would like to find out if there is an easy way to determine which one is the faulty drive.

Sure, that’s easy. You disable that drive in FreeNAS and the little light on your hot swap bays stops blinki…

Oh… let me guess. :grimacing:

I’m not using hot-swappable bays. It’s a regular ATX case. That gives me an idea though. I disabled the drive and am going to wait a few minutes for it to cool down. My hand should be able to tell which drive is cooler to the touch.

FreeNAS 9.10 is very old so I don’t know how one would do this in the GUI, but diskinfo -s ada0 will tell you the serial number of the drive.
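
To check every drive in one go, the same command can be wrapped in a short script. This is only a sketch: it assumes `diskinfo -s` is available on your release and prints just the serial number, and the helper names (`read_serial`, `build_inventory`) and the bay labels are made up for illustration.

```python
import subprocess


def read_serial(device, run=subprocess.check_output):
    """Return whatever `diskinfo -s <device>` prints, stripped of whitespace.

    `run` is injectable so the function can be exercised without real disks.
    """
    out = run(["diskinfo", "-s", device])
    if isinstance(out, bytes):
        out = out.decode("utf-8", errors="replace")
    return out.strip()


def build_inventory(devices, bays, run=subprocess.check_output):
    """Pair each device name with its serial and a hand-written bay label.

    `bays` is a dict you fill in yourself (e.g. {"ada0": "top tray"});
    devices without an entry are reported as "unknown".
    """
    return [
        {
            "device": dev,
            "serial": read_serial(dev, run),
            "bay": bays.get(dev, "unknown"),
        }
        for dev in devices
    ]
```

Calling `build_inventory(["ada0", "ada1"], {"ada0": "top tray"})` on the NAS itself would give a list you can print and tape inside the case, so the serial on the drive label can be matched to the device name next time.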

And then if he powers off the system he can just check each drive’s s/n!

Yeah, thanks guys. I found the drive and made sure to write down the name, number, and location of every drive in case something similar happens in the future. Turns out I had HGST drives, 10 x 6TB. It’s been 4 years since I built this NAS; my last NAS had WD Reds, and I forgot these were HGSTs. Anyway, warranty is out of the question: the standard is 3 years and this drive was bought 4 years ago.

Advertised at 1 million hours MTBF.
34K on my clock, almost a million.
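
For what it is worth, an MTBF figure is a fleet statistic, not a per-drive lifetime. A rough back-of-the-envelope check, assuming a constant failure rate (the simple exponential model, which real drives do not strictly follow) and using only the numbers above (1 million hours MTBF, 34K hours, 10 drives):

```python
import math

HOURS_PER_YEAR = 8766  # 365.25 days * 24 hours


def annualized_failure_rate(mtbf_hours):
    """Chance that one drive fails within a year, constant-rate assumption."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def prob_at_least_one_failure(mtbf_hours, hours_run, n_drives):
    """Chance that at least one of n_drives fails within hours_run,
    treating the drives as independent with the same constant failure rate."""
    return 1.0 - math.exp(-n_drives * hours_run / mtbf_hours)


afr = annualized_failure_rate(1_000_000)
p_any = prob_at_least_one_failure(1_000_000, 34_000, 10)
print(f"AFR per drive: {afr:.2%}")
print(f"P(at least one of 10 fails in 34K hours): {p_any:.0%}")
```

Under those assumptions the rated MTBF works out to a bit under 1% per drive per year, and a bit under 30% that at least one drive in a 10-drive array fails within 34K hours. So one bad drive at this age is not wildly out of line with the spec sheet.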

Just a point of principle: that error message doesn’t mean the drive is dead yet, just that it has some duff sectors. 16 is not enough to pull the disk, and a scrub should deal with it, but keep an eye on the drive. True, it is a sign that it is on its way out, but you can live with it for a while if you have RAIDZ2 and a hot spare.

You can migrate any “pre-fail” disks down to a lower priority array like the CCTV or secondary backup and keep them going a bit longer.

For future reference, when a device is actually unreadable, the pool status will say “Degraded” and tell you which drive is out of action. Guidance here:

+1. This is no longer in the active maintenance train. Back up and upgrade to 11.3; it is stable.

Can OP export the pool, then re-import using /diskid for more meaningful names?

I was thinking of the Linux way of using /dev/disk/by-id, but the internet says maybe?

I’ve had a similar situation and it turned out to be just a bad SATA cable. Easy fix.

The error is coming from SMART not ZFS as I understand it.

That is a ZFS error message, standard alert solved with regular pool scrubs or drive replacement

SMART errors are a little more dramatic.

Edit: this is not accurate - ignore

I just checked, that’s definitely an error message in smartd:

Yes, it is a SMART error and not a ZFS error. I know that this drive might be fine for years to come and I’m not gonna throw it away. I’ll just use it as a tertiary backup. I won’t take the chance of it screwing with my zpool and that’s why I’m replacing it. It could die tomorrow or last another 20 years, but why risk it?

That is really not enough. Is it very hot in there? Lots of vibrations? Brown-outs? Or do you have power management enabled so HDDs power down when not used?

Nevertheless: contact HGST with that screenshot. If they’re in a good mood, they might actually replace it.

It’s a well ventilated case with rubber mounts for the HDD trays. The system is connected to a UPS so it’s well protected all around.

The product isn’t even listed on their website:

https://www.westerndigital.com/support.hgst.data-center-drives

Mine is a “HGST Deskstar NAS H3IKNAS600012872SN (0S03839)”

Even stranger. I mean; shit happens, but this is way too low…

Might be a model that was not ported to the new DB when WD phased out the brand. Still: try it. I’ve heard some good things about their handling of stuff like that. Maybe you at least get a discount. Worth a mail, don’t you think?

Couldn’t hurt I suppose. I shot them an e-mail.

Interesting development. I used CCleaner’s Drive Wiper tool and did an advanced overwrite (3 passes). It took about 30 hours to finish, but after that, the SMART error is gone.

Is that normal? I wonder if FreeNAS’s scrub function would have done the same.