Hi folks. I logged into my FreeNAS server today via the web interface and was greeted with this message:
“Device: /dev/ada0, 16 Currently unreadable (pending) sectors”
It seems that one of my hard drives (ada0) needs to be replaced. My question is: how do I determine which physical drive in my server is the malfunctioning one? I have 10 WD Red 6TB drives (RAIDZ2) in there, and I would like to know if there is an easy way to identify the faulty drive.
I’m not using hot-swappable bays. It’s a regular ATX case. That gives me an idea though. I disabled the drive and I’m gonna wait a few minutes for it to cool down. My hand should be able to tell which drive is cooler to the touch.
Yeah, thanks guys, I found the drive and made sure to write down the name, number, and location of every drive in case something similar happens in the future. Turns out I had HGST drives. 10 x 6TB. It’s been 4 years since I built this NAS; my last NAS had WD Reds and I forgot these were HGSTs. Anyway, warranty is out of the question. Standard is 3 years and this drive was bought 4 years ago.
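For anyone landing here with the same question: the usual trick is to match the serial number reported by `smartctl -i` against the label printed on each physical drive. Below is a minimal Python sketch of that idea, assuming smartmontools is installed and that its output contains a `Serial Number:` line. The helper names `parse_serial` and `read_serial` are made up for illustration, not part of any tool.

```python
import re
import subprocess


def parse_serial(smartctl_info):
    """Return the serial number found in `smartctl -i` text output, or None."""
    # smartctl prints a line such as "Serial Number:    <serial>"
    match = re.search(r"^Serial Number:\s*(\S+)", smartctl_info, re.MULTILINE)
    return match.group(1) if match else None


def read_serial(device):
    """Run `smartctl -i` on a device (e.g. /dev/ada0) and return its serial.

    Assumes smartctl is on the PATH and the script has permission to query
    the device. Hypothetical helper for illustration only.
    """
    result = subprocess.run(
        ["smartctl", "-i", device],
        capture_output=True,
        text=True,
    )
    return parse_serial(result.stdout)
```

Calling `read_serial("/dev/ada0")` on the NAS should give the serial to look for on the drive labels. Looping over all the device names once and saving the results next to each drive's bay position gives you the written-down map mentioned above.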
Just a point of principle: that error message doesn’t mean the drive is dead yet, just that it has some duff sectors. 16 is not enough to pull the disk, and a scrub should deal with it, but keep an eye on the drive. True, it is a sign that it is on its way out, but you can live with it for a while if you have RAIDZ2 and a hot spare.
You can migrate any “pre-fail” disks down to a lower-priority array like the CCTV or secondary backup and keep them going a bit longer.
For future reference, when a device actually becomes unreadable, the pool will say “Degraded” and tell you which drive is out of action. Guidance here:
+1. This is no longer in the active maintenance train. Back up and upgrade to 11.3. It is stable.
Yes, it is a SMART error and not a ZFS error. I know that this drive might be fine for years to come and I’m not gonna throw it away. I’ll just use it as a tertiary backup. I won’t take the chance of it screwing with my zpool and that’s why I’m replacing it. It could die tomorrow or last another 20 years, but why risk it?
That is really not enough. Is it very hot in there? Lots of vibrations? Brown-outs? Or do you have power management enabled so the HDDs power down when not used?
Nevertheless: contact HGST with that screenshot. If they’re in a good mood, they might actually replace it.
Even stranger. I mean, shit happens, but this is way too low…
Might be a model that was not ported to the new DB when WD phased out the brand. Still, try it. I’ve heard some good things about how they handle stuff like that. Maybe you at least get a discount. Worth a mail, don’t you think?
Interesting development. I used CCleaner’s Drive Wiper tool and did an advanced overwrite (3 passes). It took about 30 hours to finish, but after that, the SMART error is gone.
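A full overwrite normally clears pending sectors because each one is either rewritten successfully in place or remapped to a spare. To see which happened, you can compare attribute 197 (Current_Pending_Sector) with attribute 5 (Reallocated_Sector_Ct) in the `smartctl -A` output. Here is a rough Python sketch for pulling those values out. It assumes the standard ten-column attribute table that smartmontools prints for ATA drives, and `parse_smart_attributes` is just an illustrative name.

```python
def parse_smart_attributes(smartctl_attrs):
    """Map attribute ID -> (name, raw value string) from `smartctl -A` output."""
    attrs = {}
    for line in smartctl_attrs.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit():
            # Only the first token of the raw value is kept
            attrs[int(fields[0])] = (fields[1], fields[9])
    return attrs
```

If attribute 197 is back to 0 and attribute 5 has not grown, the sectors were most likely rewritten in place. If attribute 5 went up, the drive remapped them, which is worth watching over the next few scrubs.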