Fixing an “add space” mistake with ZFS

Hi All
I made a mistake with a drive add (after a child removed a drive from my array).
The spare started resilvering, and I removed the drive which had been pulled, then ran zpool add, not zpool add space.
Now I cannot remove the drive. Any ideas how to remove ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR?

  pool: rbd
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Dec 29 16:23:11 2023
	8.56T / 18.8T scanned at 16.7G/s, 294G / 18.8T issued at 572M/s
	25.4G resilvered, 1.52% done, 09:27:01 to go
config:

	NAME                                                 STATE     READ WRITE CKSUM
	rbd                                                  ONLINE       0     0     0
	  raidz2-0                                           ONLINE       0     0     0
	    ata-ST2000VN004-2E4164_Z52379Y6                  ONLINE       0     0     0
	    ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M2NVFF1K         ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WCC6N0TET028            ONLINE       0     0     0
	    ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M2NVFT9F         ONLINE       0     0     0
	    ata-ST2000VN004-2E4164_Z5237BAD                  ONLINE       0     0     0
	    wwn-0x50014ee059e5c13e                           ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0L7XYAZ            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0M62XTF            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0L5EF5V            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0M0S4S6            ONLINE       0     0     0  (resilvering)
	    wwn-0x50014ee00490a8ff                           ONLINE       0     0     0
	  ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR              ONLINE       0     0     0
	special
	  mirror-4                                           ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC1522001H058A-part2  ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC2092009S058A-part2  ONLINE       0     0     0
	logs
	  mirror-3                                           ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC2092009S058A-part1  ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC1522001H058A-part1  ONLINE       0     0     0

Any ideas?

Device removal is only possible if no top-level vdev is RAIDZ. You can only remove vdevs when all top-level vdevs are single disks or mirrors.

There is nothing you can do about it. Back up and recreate the pool with the correct configuration of vdevs. This is the procedure most of us use to change vdev arrangements.
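
For reference, a minimal sketch of that procedure with zfs send/recv, assuming a hypothetical second pool called backup that is large enough to hold everything:

	# snapshot the whole pool recursively, then replicate it
	zfs snapshot -r rbd@migrate
	zfs send -R rbd@migrate | zfs recv -F backup/rbd
	# destroy rbd, recreate it with the intended vdev layout,
	# then send the data back the same way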

Having a non-redundant stripe as a vdev not only lacks any self-healing, it has no redundancy at all, and without that vdev the pool will fault in its entirety = all dead.

edit: alternatively, you can add redundancy to the vdev, e.g. a 2- or 3-way mirror, by attaching additional drives, and the pool will be fine for all intents and purposes. We expand pools by adding vdevs after all, but usually while keeping the same redundancy so as not to have weak links.
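
A rough sketch of that attach, where the by-id path of the second drive is a placeholder you would substitute:

	# attach a second disk to the stray single-disk vdev,
	# turning it into a 2-way mirror
	zpool attach rbd ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR /dev/disk/by-id/<new-disk-id>

Once the new side finishes resilvering, losing either of those two disks no longer takes out the whole pool.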


So I accidentally missed one word and it's not fixable?
That's just insane.

If I have to recreate, I'll go with Ceph.

I had two spares; a drive failed a few weeks ago and I'm waiting on replacements.

Also, could be wrong, but that looks like a typo, as you probably wanted to add a “spare” instead of “space”.

Presumably that was an autocorrect typo, as the system would have errored, saying “no pool named space” into which it could add a drive.
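
For anyone landing here later, the difference really is one word, and zpool add -n does a dry run that prints the resulting layout without touching the pool:

	# what was probably intended: add the disk as a hot spare
	zpool add rbd spare ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR
	# what actually ran: add the disk as a new non-redundant top-level vdev
	zpool add rbd ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR
	# preview the resulting config first instead of committing
	zpool add -n rbd spare ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR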

And yeah, it's brutal, but there's no way to remove a vdev from a raidz pool, and new writes are going to be placed on the odd disk, so the longer it's used, the more data will be at risk :frowning:

Presumably, offlining the disk (which is a whole vdev) would kill the pool? Making the drive into a mirror was a great idea, for now.

I know it’s too late now, but this is where zpool checkpoint comes into play. Before adding any drives to a pool, running this command gives you the ability to roll back should you add a drive incorrectly.
https://openzfs.github.io/openzfs-docs/man/master/8/zpool-checkpoint.8.html
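
Roughly like this, using the pool name from this thread:

	# take a checkpoint before the risky change
	zpool checkpoint rbd
	# ... run the zpool add ...
	# if it went wrong: export, then import rewound to the checkpoint
	zpool export rbd
	zpool import --rewind-to-checkpoint rbd
	# if it went fine: discard the checkpoint so it stops pinning old blocks
	zpool checkpoint -d rbd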

You should be able to remove a non-redundant single top-level vdev; review the documentation at zpool-remove.8 — OpenZFS documentation for details.
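
The command itself would be the following, though as noted above it is expected to fail here because the pool still contains a raidz2 top-level vdev:

	zpool remove rbd ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR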

That’s the power and responsibility that comes with being root.

The problem is that the single drive you added is a vdev on the same level as the raidz2 vdev you had before. ZFS does not keep a record of whether any data has already been distributed onto that new vdev, and there is currently no logic in ZFS to redistribute data off a vdev or to rebalance vdevs that have been written to unevenly. So there is simply no way for it to know whether it can remove that vdev without destroying data, which is why there is no option to do this except the safe way: manually recreating the pool. But you need to see the positive in this lesson: from now on you will make sure the command is correct, or that you can reverse it, before you make any big changes. It's a lesson all of us have to learn at some point.
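
One way to see how much has already landed on the stray vdev is the per-vdev listing; the ALLOC column shows how unevenly the data sits, and new writes favor the emptier vdev:

	# per-vdev capacity breakdown for the pool from this thread
	zpool list -v rbd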

Thanks for this information

Will this keep me safe until I can afford to buy a large enough set of drives to copy the data to?

root@pve:~# zpool status
  pool: rbd
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Dec 29 16:23:11 2023
	18.8T / 18.8T scanned, 17.4T / 18.8T issued at 291M/s
	1.51T resilvered, 92.48% done, 01:24:47 to go
config:

	NAME                                                 STATE     READ WRITE CKSUM
	rbd                                                  ONLINE       0     0     0
	  raidz2-0                                           ONLINE       0     0     0
	    ata-ST2000VN004-2E4164_Z52379Y6                  ONLINE       0     0     0
	    ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M2NVFF1K         ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WCC6N0TET028            ONLINE       0     0     0
	    ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M2NVFT9F         ONLINE       0     0     0
	    ata-ST2000VN004-2E4164_Z5237BAD                  ONLINE       0     0     0
	    wwn-0x50014ee059e5c13e                           ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0L7XYAZ            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0M62XTF            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0L5EF5V            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0M0S4S6            ONLINE       0     0     0  (resilvering)
	    wwn-0x50014ee00490a8ff                           ONLINE       0     0     0
	  mirror-5                                           ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WMC6N0P4R4FR            ONLINE       0     0     0
	    ata-HGST_HUS722T2TALA604_WCC6N1KH4TCJ            ONLINE       0     0     0  (awaiting resilver)
	special
	  mirror-4                                           ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC1522001H058A-part2  ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC2092009S058A-part2  ONLINE       0     0     0
	logs
	  mirror-3                                           ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC2092009S058A-part1  ONLINE       0     0     0
	    nvme-INTEL_SSDPEK1A058GA_PHOC1522001H058A-part1  ONLINE       0     0     0
	spares
	  ata-HGST_HUS722T2TALA604_WCC6N6CNPTCJ              AVAIL

errors: No known data errors