I really should replace this drive in my FreeNAS box

A couple of you here have already asked me to replace this drive, but I'm letting it ride "until death do us part" since the pool reports always come back healthy… and now another Multi-Zone error has been logged.

WD Red HDD.

########## SMART status report for ada0 drive (Western Digital Red: WD-WCC7K7KKNRD6) ##########
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   204   167   021    Pre-fail  Always       -       4766
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       25
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4208
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       19
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       14
194 Temperature_Celsius     0x0022   116   103   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

No Errors Logged

Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
Short offline       Completed without error       00%      4180         -

How old is this drive, and what have you put it through?

6 months of operation, 30 or so spin-ups. Still not a great deal of work for an HDD. Are you going to create a cron job, log all the SMART parameters to a file, and plot its demise after the fact?
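The cron-job idea is easy enough to sketch. A minimal version, assuming the device is `ada0` and a log path of my choosing (both are assumptions, adjust for your box), appends one timestamped CSV row per run with the raw Multi_Zone_Error_Rate value:

```shell
#!/bin/sh
# Sketch of the SMART-logging cron job described above.
# DEV and LOG are assumptions -- substitute your own device and path.
DEV=/dev/ada0
LOG=/var/log/smart_ada0.csv

# Pull the raw value (10th column) for the Multi_Zone_Error_Rate
# attribute out of the `smartctl -A` attribute table.
RAW=$(smartctl -A "$DEV" | awk '$2 == "Multi_Zone_Error_Rate" { print $10 }')

# Append a timestamped row, e.g. "2018-05-01T12:00:00Z,1".
echo "$(date -u '+%Y-%m-%dT%H:%M:%SZ'),$RAW" >> "$LOG"
```

Run it hourly from root's crontab (e.g. `0 * * * * /root/smart_log.sh`) and you have a time series to plot.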

2 Likes

Buy a refurb drive; better safe than sorry. Unless it's the second half of a RAID 1 array, in which case you could test it out a little longer to see what's actually wrong: either it's a bad drive or a bad board.

Its whole lifetime has been in the FreeNAS box.

@SgtAwesomesauce craziest thing ever - looked at the serial on the drive closest to the Plexi door… it's the one with the Multi-Zone error rate.

A couple of days ago it recorded 11… The new HDD just arrived from Amazon. Now the question is whether I can do the swap without really having a backup anywhere else… eek!

Should I first set up a replication stack, replicate all existing data, THEN attempt resilvering onto the replacement drive? It's only RAIDZ2.
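For the belt-and-braces route, a replication pass before the swap could look roughly like this. The pool name `big` comes from the status report later in the thread; the target host and dataset are hypothetical:

```shell
# Take a recursive snapshot of the whole pool, then stream it to a
# second machine. "backuphost" and "backup/big" are assumptions --
# substitute a real host and a dataset with enough free space.
zfs snapshot -r big@pre-resilver
zfs send -R big@pre-resilver | ssh backuphost zfs receive -F backup/big
```

With RAIDZ2 and only one suspect drive this is optional, but it means a second resilver failure mid-replacement wouldn't cost you the data.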

Okay, it's Z2, so you have two-drive failure tolerance. That's good.

How big are the drives and are you able to accept downtime?

1 Like

4TB each. ~6-12 hrs is tolerable?

I’d just do the resilver.
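The raw CLI shape of that swap is something like the following; note the device names are assumptions, and on FreeNAS the usual route is the GUI's "Replace" button, which also handles partitioning and gptid labels for you:

```shell
# Offline the suspect disk so the pool stops using it.
zpool offline big ada0
# ...power down or hot-swap the physical drive here...
# Tell ZFS to rebuild onto the new disk in the same slot.
zpool replace big ada0
# Watch the resilver until it reports complete.
zpool status big
```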

1 Like

This might be an issue - I'm over the recommended 80% usage limit


########## ZPool status report summary for all pools ##########

+--------------+--------+------+------+------+----+--------+------+-----+
|Pool Name     |Status  |Read  |Write |Cksum |Used|Scrub   |Scrub |Last |
|              |        |Errors|Errors|Errors|    |Repaired|Errors|Scrub|
|              |        |      |      |      |    |Bytes   |      |Age  |
+--------------+--------+------+------+------+----+--------+------+-----+
|freenas-boot  |ONLINE  |     0|     0|     0|  2%|       0|     0|   20|
|big           |ONLINE  |     0|     0|     0| 87%|       0|     0|   16|
+--------------+--------+------+------+------+----+--------+------+-----+

???

It'll still resilver a new drive fine, but it's a good time to consider replacing all the drives with larger ones, or replicating to a different zpool of higher capacity.
Personally I'd keep the array and just cycle in larger drives (remember to set auto-expand if it's not already turned on in FreeNAS).

1 Like

What do you mean ‘cycle’ larger drives… WAIT is that a feature???

  • Take drive 0 out
  • replace with 10TB // Resilver // wait
  • replace drive 1 with 10TB // resilver // wait
  • iterate till drive N is replaced.

Right?

1 Like

@SgtAwesomesauce LOL I’m pretty sure you told me this and I missed it!!!

Where is the auto-resize option in V11 dude? ta

Yup, that's how cycling works: replace and resilver one drive at a time.
It's slow (even with 2-4TB drives) but works a charm. Above that size I'm not sure the time is worth it versus replicating, but for now that's the easy, safe way.
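As a sketch, the whole cycle is just the single-drive replace repeated, waiting out each resilver before touching the next disk. Disk names and the pool name are assumptions, and on FreeNAS you would normally do each swap through the GUI rather than raw `zpool`:

```shell
#!/bin/sh
# Sketch of cycling larger drives through a pool, one at a time.
# POOL and the disk list are assumptions for illustration only.
POOL=big
for DISK in ada0 ada1 ada2 ada3; do
    zpool offline "$POOL" "$DISK"
    echo "Swap $DISK for the larger drive, then press Enter."
    read dummy
    zpool replace "$POOL" "$DISK"
    # Block until this disk's resilver completes before moving on.
    while zpool status "$POOL" | grep -q 'resilver in progress'; do
        sleep 60
    done
done
```

Once the last disk is resilvered (and autoexpand is on), the pool grows to the new capacity.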

Basically this.

1 Like

With raidz2 you could technically replace 2 at a time, but you better have good backups :slight_smile:

technically

but with one failure already, I’m going to go ahead and say don’t risk it.

1 Like

Thanks, and yeah, wouldn't go near it myself…
I can't find autoexpand=on in the FreeNAS GUI, so I went to the shell, set the property, then exported and re-imported the pool = magic! But I'm only on version 9.
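For anyone following along, the shell route described above boils down to this (pool name taken from the status report earlier in the thread):

```shell
# Enable automatic expansion, then export and re-import the pool
# so the new size is picked up.
zpool set autoexpand=on big
zpool export big
zpool import big
# Verify the property stuck.
zpool get autoexpand big
```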

2 Likes

Brilliant thanks bro