# Accidentally added a disk to a ZFS RaidZ pool instead of replacing failing one



## Green_Bandit (Apr 8, 2011)

As the title says, I somehow ended up *adding* a disk to my ZFS RaidZ pool instead of replacing the failed one with it. The new disk is the RMA replacement for the aforementioned dead disk.
Is it possible to detach a disk from a RaidZ pool without there being a "spare parity disk" to fall back on?
Or do I need to back up all my data, destroy the pool and recreate it with the correct parameters?
I don't have any spare disk(s) handy for copying all the data from the RaidZ pool onto, so I really can't do the latter option.


----------



## SirDice (Apr 8, 2011)

Green_Bandit said:

> Is it possible to detach a disk from a RaidZ pool without there being a "spare parity disk" to fall back on?


With RAIDZ the parity is spread around all disks.

Can you post the output of `# zpool list`?

If I'm not mistaken you can't add just 1 disk to an existing raidz.


----------



## phoenix (Apr 8, 2011)

Green_Bandit said:

> As the title says, I somehow ended up *adding* a disk to my ZFS RaidZ pool instead of replacing the failed one with it. The new disk is the RMA replacement for the aforementioned dead disk.
> Is it possible to detach a disk from a RaidZ pool without there being a "spare parity disk" to fall back on?
> Or do I need to back up all my data, destroy the pool and recreate it with the correct parameters?
> I don't have any spare disk(s) handy for copying all the data from the RaidZ pool onto, so I really can't do the latter option.



If you look at the output of `$ zpool status` and you see a single disk listed at the same level as the raidz vdev, then you are now running a non-redundant pool with 2 vdevs (1 raidz, 1 single disk).
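If you would rather not eyeball the output, the check can be roughly scripted. This is only a sketch: the pool name `tank`, the `ada0` to `ada2` disks, and both function names are made up for illustration, and the parser assumes the usual two-space indentation per level in the `config:` section of `zpool status`.

```shell
#!/bin/sh
# Print every top-level vdev that is a plain disk (no raidz/mirror wrapper).
# Reads `zpool status` text on stdin. Assumption: each nesting level in the
# config section is indented two spaces deeper than its parent.
single_disk_vdevs() {
    awk '
        /^config:/ { in_config = 1; seen_pool = 0; aux = 0; next }
        /^errors:/ { in_config = 0 }
        !in_config || NF == 0 || $1 == "NAME" { next }
        {
            line = $0
            sub(/^\t+/, "", line)                # drop the leading tab(s)
            n = 0
            while (substr(line, n + 1, 1) == " ") n++
            if (!seen_pool) { base = n; seen_pool = 1; next }   # pool line
            depth = n - base
            if (depth == 0) { aux = 1; next }    # logs/cache/spares header
            if (depth == 2 && !aux && $1 !~ /^(raidz|mirror|replacing|spare)/)
                print $1
        }
    '
}

# Hypothetical sample: a 3-disk raidz1 plus an accidentally added single disk.
sample_status() {
cat <<'EOF'
  pool: tank
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        ada0    ONLINE       0     0     0
        ada1    ONLINE       0     0     0
        ada2    ONLINE       0     0     0
      ada10     ONLINE       0     0     0

errors: No known data errors
EOF
}

sample_status | single_disk_vdevs    # prints: ada10
```

On a live system you would pipe the real thing in instead, e.g. `zpool status poolname | single_disk_vdevs`. Empty output means no bare disk sits next to the raidz vdev.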

There are two ways to fix this, depending on your desired outcome.

If you just want to keep the pool alive and safe, then you can *attach* another drive to the single disk, turning it into a mirror vdev.  Then you will have a pool with 2 vdevs:  1 raidz, 1 mirror.  And your pool will be redundant and your data will be safe.  It's not ideal, as you will have unbalanced vdevs.  But it works (this is the setup I have at home, 3-SATA raidz1 + 2-IDE mirror).  `# zpool attach poolname olddisk newdisk`

If you want your pool to be just raidz vdevs, then you will need to back up your data, destroy your pool, create a new pool, and restore your data.

There is no way to remove a top-level vdev from a pool.


----------



## phoenix (Apr 8, 2011)

SirDice said:

> With RAIDZ the parity is spread around all disks.
> 
> Can you post the output of `# zpool list`?
> 
> If I'm not mistaken you can't add just 1 disk to an existing raidz.



`# zpool add poolname ada10`
Oops, you've just added a new vdev to *poolname*, made up of a single, non-redundant, standalone disk. The pool is now striped across all the existing vdevs plus the single disk. If that single disk dies, all the data in the whole pool is gone.
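One way to make that slip harder is a small wrapper that refuses bare disks and otherwise only does a dry run. This is a sketch, not a standard tool: `vdev_spec_is_safe` and `safe_zpool_add` are hypothetical names, and the check only looks at the first word of the vdev spec. The `-n` flag of `zpool add` is real and prints the configuration that would result without changing the pool. As far as I know, `zpool add` also complains about a mismatched replication level unless `-f` is given, so treating `-f` with suspicion helps too.

```shell
#!/bin/sh
# Return 0 if the vdev spec starts with a redundant vdev type or an
# auxiliary class, 1 if it would add bare (non-redundant) disks.
vdev_spec_is_safe() {
    case "$1" in
        mirror|raidz|raidz1|raidz2|raidz3|spare|log|cache) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: safe_zpool_add <pool> <vdev spec...>
safe_zpool_add() {
    pool=$1
    shift
    if ! vdev_spec_is_safe "$1"; then
        echo "refusing: '$*' would join '$pool' as non-redundant top-level vdev(s)" >&2
        return 1
    fi
    # Dry run only: show the resulting layout, change nothing.
    zpool add -n "$pool" "$@"
}
```

With this in place, `safe_zpool_add poolname ada10` is refused, while `safe_zpool_add poolname mirror ada10 ada11` shows what the pool would look like and leaves the real `zpool add` to be run by hand afterwards.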


----------



## Green_Bandit (Apr 9, 2011)

```
root@SERVER:~$zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
array  1,02T  4,32T  1,02T  /array
```


```
root@SERVER:~$zpool status
  pool: array
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

	NAME                                               STATE     READ WRITE CKSUM
	array                                              DEGRADED     0     0     0
	  raidz1-0                                         DEGRADED     0     0     0
	    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJDWZ212027  UNAVAIL      0     0     0  cannot open
	    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJDWZ212022  ONLINE       0     0     0
	    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJDWZ212013  ONLINE       0     0     0
	    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJDWZ212023  ONLINE       0     0     0
	    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJ90ZC14975  ONLINE       0     0     0

errors: No known data errors
root@SERVER:~$
```
Knew I should've posted that in the OP, but I was in a hurry, and that's probably also half the reason I screwed up.

The disk with the id _ata-SAMSUNG_HD154UI_S1XWJDWZ212027_ is the one that one day refused to start up during boot, so the array never detected it as failing, just as missing.

The disk with the id _ata-SAMSUNG_HD154UI_S1XWJ90ZC14975_ is the one that was meant to replace it, and that I accidentally added to the pool instead.

And I don't have any other disk(s) to back up the data to.

And since _ata-SAMSUNG_HD154UI_S1XWJDWZ212027_ is dead and gone, there's no redundancy, so this is an iffy situation I've created.


----------



## t1066 (Apr 11, 2011)

Could you post the output of

`# zpool history`

It should show us what you actually did.


----------



## jem (Apr 14, 2011)

As a slight aside, how do you expose your disk IDs under /dev/disk/by-id/?


----------



## Green_Bandit (Apr 16, 2011)

t1066 said:

> Could you post the output of
> 
> `# zpool history`
> 
> It should show us what you actually did.




```
2011-03-04.18:56:50 zpool import 6906024593045588727
2011-03-05.10:04:40 zpool scrub array
2011-04-08.15:26:33 zpool add -f array disk/by-id/ata-SAMSUNG_HD154UI_S1XWJ90ZC14975
```
The top one is from when I migrated the pool from another server to the current one. The scrub is from when I first noticed something fishy was going on.
And the bottom one is my slip-up.

The rest is stuff that's more than half a year old: scrubs and so on.


----------



## t1066 (Apr 17, 2011)

In this case, the best way to proceed is as phoenix suggested: back up the data, destroy the pool, etc.


----------



## Green_Bandit (Apr 21, 2011)

Kind of expected that to be the solution. Bought another drive identical to those in the pool to temporarily back up the data to. Gonna keep it as a spare drive in case another one of them fails.


----------



## Big_Boss (Sep 29, 2013)

Look: after one disk goes bad, your data is still available on two disks. Now that you have added a new disk, the data is striped over three disks with RAID-Z. (It will not stripe over four disks, because the fourth disk is already 'missing'.) So I think you can remove the faulty disk without any issue.

But since no one has suggested this solution, something must be wrong with my logic.

Can someone explain what is wrong with my logic?


----------



## Crivens (Sep 30, 2013)

Big_Boss said:

> But since no one has suggested this solution, something must be wrong with my logic.


<voice of elderly monk> It is. 



> Can someone explain what is wrong with my logic?



In your model, the RAID-Z forms one drive. That means the OP now has only two "drives" across which the data is striped. One of them is a bit harder to corrupt because it actually is a RAID-Z, so one disk in that "drive" can fail without data being destroyed. If the other, single, disk fails, the OP will lose one of the striping devices, which is fatal. In the RAID-Z, two disks would need to fail to destroy a striping device.

ZFS has at least two layers of organisation. This can take some effort to get used to, but it is well worth the work.

My storage system at home can lose two disks without data loss, three can fail without losing important data, and four can fail without losing _really_ important data. Praised be the ZFS copies attribute!


----------



## kpa (Sep 30, 2013)

The `zpool add ...` command extends an existing pool with one more vdev, and the pool stripes data across all of its vdevs, RAID 0 style (in other words, with no redundancy between vdevs). If the new vdev happens to be a single disk, that vdev has no internal redundancy, and the whole pool becomes non-redundant: if the single-disk vdev has a problem, the whole pool is lost. In ZFS, redundancy exists only within a single vdev (if that vdev has any), and a vdev cannot provide redundancy for any other vdev.
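That rule can be written down as a toy model. This is purely illustrative and not anything ZFS ships: `pool_survives` is a made-up helper, and each argument is a hypothetical `tolerated:failed` pair for one top-level vdev (a raidz1 tolerates 1 failed disk, a two-way mirror 1, a single disk 0).

```shell
#!/bin/sh
# The pool survives only if EVERY top-level vdev stays within the number
# of disk failures it can tolerate; vdevs cannot cover for each other.
# Usage: pool_survives <tolerated:failed> [<tolerated:failed> ...]
pool_survives() {
    for vdev in "$@"; do
        tolerated=${vdev%%:*}
        failed=${vdev##*:}
        if [ "$failed" -gt "$tolerated" ]; then
            return 1
        fi
    done
    return 0
}

# raidz1 with one dead disk, plus a healthy single-disk vdev: still alive.
if pool_survives 1:1 0:0; then echo "pool survives"; else echo "pool lost"; fi

# Healthy raidz1, but the single-disk vdev dies: everything is gone.
if pool_survives 1:0 0:1; then echo "pool survives"; else echo "pool lost"; fi
```

The first call prints `pool survives` and the second prints `pool lost`, even though the raidz1 in the second case still has all of its disks.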


----------



## throAU (Oct 1, 2013)

New feature request:  `zpool add` takes a snapshot that can be rolled back to pre-VDEV addition 

Sucks that you managed to do that so easily, but it does follow the general Unix principle of assuming the system administrator knows what they are doing...

But yes, I'd suggest that the least "painful" (in terms of downtime and work) way out of this would be to add another TWO disks.  One to create a mirror VDEV from the single disk, and another to replace the failed drive in your RAID-Z VDEV properly.  Yes, money... depends how much your time (and your server's uptime) is worth.


----------



## phoenix (Oct 2, 2013)

Your array pool is fine.  You didn't add a single-disk vdev to it. Notice how it only lists five disks underneath the "raidz1-0" heading, and no other disks are listed anywhere.

All you have to do is pull the "UNAVAIL" drive, physically replace it with a new drive, and use `zpool replace`.
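A cautious way to script that last step is to print the command first and only run it once it looks right. This is a sketch under assumptions: the `run` and `replace_unavail` helpers and the `DRY_RUN` variable are made up here, and `disk/by-id/NEW_DISK_ID` is a placeholder for whatever ID the new drive actually gets. The `zpool replace pool old_device new_device` form itself is the standard syntax.

```shell
#!/bin/sh
# Echo the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: replace_unavail <pool> <old device> <new device>
replace_unavail() {
    run zpool replace "$1" "$2" "$3"
}

# Dry run with the UNAVAIL disk from the status output above;
# the new device name is a placeholder.
replace_unavail array \
    disk/by-id/ata-SAMSUNG_HD154UI_S1XWJDWZ212027 \
    disk/by-id/NEW_DISK_ID
```

Setting `DRY_RUN=0` makes the same call actually start the replacement, after which `zpool status` should show the resilver in progress.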


----------

