# zfs/zpool offline problems



## c_geier (Aug 24, 2010)

Hi,
I'm having problems replacing a WD10EARS (hasn't failed yet) in my raidz1.

The pool seems to be fine:

```
# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad0     ONLINE       0     0     0
            ad2     ONLINE       0     0     0
            ad4     ONLINE       0     0     0
            ad6     ONLINE       0     0     0

errors: No known data errors
```

but if I now try to "zpool offline" a drive, zpool returns an error message:

```
# zpool offline storage ad2
cannot offline ad2: no valid replicas
```

Isn't this the way to replace a disk in a raidz1?

I'm using 8.0-RELEASE-p2

Please help if you can. Thanks!


----------



## graudeejs (Aug 24, 2010)

I don't see any spare disks, that would replace it in your setup 
Your raid will work without ad2..... 

What I understand is that you want some other disk to replace ad2, right?
For this you need one unused disk. You would add it to the pool with `# zpool add storage spare ad8`, for example.

Or rebuild your raid with 3 disks and use the fourth as a spare.


----------



## c_geier (Aug 24, 2010)

The spare disk is still on my desk. 

Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.


----------



## graudeejs (Aug 24, 2010)

That depends on how you want to do it.
You can keep it in the PC and have it replace a faulty disk automatically, without needing a shutdown.
If it's a server, then I would add it to the zpool as a spare.
If it's a home PC, I would keep it in a box.


----------



## Matty (Aug 24, 2010)

Did it on my pool just for fun and didn't have any problems. I'm running a raid10 ZFS setup, though.

It should be possible to offline one disk in a raidz1 pool, keep the pool running, and online the disk again after a few seconds to see what the resilvering looks like.


----------



## Matty (Aug 24, 2010)

c_geier said:

> The spare disk is still on my desk.
> 
> Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.



That would be the normal procedure with non-hotswap disks.


----------



## phoenix (Aug 24, 2010)

killasmurf86 said:

> I don't see any spare disks, that would replace it in your setup
> Your raid will work without ad2.....
> 
> What I understand is that you want some other disk to replace ad2, right?
> ...



No, no, no, no, and no.  

You do not *NEED* a spare vdev in a pool in order to replace a drive in the pool.

The process for doing so is:

1. `zpool offline poolname devicename`
2. stop/detach the disk using controller methods or ata/ahci/camcontrol
3. physically remove the disk
4. insert the new disk
5. label/partition it if needed
6. `zpool replace poolname devicename` (if the new disk has a different device name, use `zpool replace poolname olddevicename newdevicename`)
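
As a rough illustration of the steps above, here is a minimal dry-run sketch. It only prints the commands so they can be reviewed and run by hand; it does not touch any pool. `print_replace_plan` is a made-up helper name, and `storage`, `ad2` and `ad8` are just the example names used earlier in this thread.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for swapping one disk in a pool.
# Nothing is executed here; review the output and run the lines by hand.
# print_replace_plan is a hypothetical helper, not part of ZFS.
print_replace_plan() {
    pool=$1
    old=$2
    new=${3:-$2}    # new device name defaults to the old one

    echo "zpool offline $pool $old"
    echo "# detach the disk, swap the hardware, label/partition if needed"
    if [ "$new" = "$old" ]; then
        echo "zpool replace $pool $old"
    else
        echo "zpool replace $pool $old $new"
    fi
    echo "zpool status $pool"
}
```

For example, `print_replace_plan storage ad2` prints the plan for a new disk that shows up under the same name, and `print_replace_plan storage ad2 ad8` covers the case where the new disk gets a different device name.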

Process works beautifully on FreeBSD 7.x and 8.x using ZFSv6, ZFSv13, and ZFSv14 (those are the versions I've done this on).  Works for replacing dead drives, works for replacing good drives with larger ones.  IOW, it just works.

No spare vdevs required.


----------



## phoenix (Aug 24, 2010)

c_geier said:

> Since no disk has faulted, I was under the impression that I should offline it before shutting down the machine, replacing the disk with a new disk, rebooting and then restoring.



That is the best method for replacing a "good" disk.

You can also do it by powering off the box, physically replacing the drive, booting, and using "zpool replace" to replace the FAULTED drive.  However, that can lead to problems where you end up with a disk that is not replaceable, the new drive won't finish resilvering, and you run the risk of losing the entire pool.  Very stressful situation.

Better to do it using the "zpool offline" method.


----------



## phoenix (Aug 24, 2010)

Matty said:

> That would be the normal procedure with non-hotswap disks.



That would be the normal procedure, period.    Doesn't matter if the disks are hot-swappable or not.


----------



## c_geier (Aug 24, 2010)

phoenix said:

> Better to do it using the "zpool offline" method.




So why is zpool under the impression that it cannot take a disk offline?


----------



## graudeejs (Aug 24, 2010)

Thanks for filling my knowledge gaps


----------



## Matty (Aug 24, 2010)

So we all agree on the right method, but why can't TS offline the disk? 3 out of 4 disks should be enough to keep the pool going.

@TS: Have you tried to scrub the pool first?


----------



## phoenix (Aug 24, 2010)

As a test, power off the machine, disconnect 1 drive, and boot.  See if the pool comes up or not.  If it does, then try to offline the FAULTED drive.  If there are no errors, then power off, replace the drive, and boot, and see if you can do the zpool replace.

Note:  do not format the old drive, as you may need to boot with it attached if things go awry with the replace.
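
For the test above, one possible way to see at a glance which devices are not ONLINE is to filter the `zpool status` output. This is only a sketch: `not_online` is a made-up helper, and it assumes the table layout shown earlier in this thread (a `NAME STATE ...` header row, with the table ending before the `errors:` line).

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` output on stdin and prints every
# pool, vdev or disk whose STATE column is not ONLINE. Prints nothing when
# everything in the config table is ONLINE.
not_online() {
    awk '
        # header row of the config table
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # the table ends before the "errors:" line
        /^errors:/ { in_config = 0 }
        # inside the table: column 1 is the name, column 2 the state
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

It would be used as `zpool status storage | not_online`; an empty result means every device in the table reports ONLINE.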


----------



## c_geier (Aug 25, 2010)

@Matty: No, I hadn't scrubbed first, but that did the trick: after scrubbing the pool I could offline the disk, and the pool is now resilvering. Thanks!

But strangely, the scrub finished in 0h0m and did not report any errors:

```
# zpool status storage
 pool: storage
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Wed Aug 25 18:49:13 2010
config:

NAME        STATE     READ WRITE CKSUM
storage     ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    ad0     ONLINE       0     0     0
    ad2     ONLINE       0     0     0
    ad4     ONLINE       0     0     0
    ad6     ONLINE       0     0     0

errors: No known data errors
```

While I don't completely understand this, I won't complain since it's working now.  Thanks, everybody!
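
For anyone hitting the same error, the sequence that worked in this thread (scrub first, then offline) could be sketched roughly as below. Both function names are made up for illustration. The check assumes the `scrub: scrub completed ...` line format shown in the output above (8.0-RELEASE); other ZFS versions may word that line differently. Note that it matches a completed scrub whether or not errors were reported.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `zpool status` text on stdin reports
# a finished scrub, using the line format seen in this thread.
scrub_completed() {
    grep -q '^ *scrub: scrub completed'
}

# Sketch of the sequence that worked here: scrub, wait, then offline.
# This runs real zpool commands, so only call it on a pool you mean to change.
scrub_then_offline() {
    pool=$1
    disk=$2
    zpool scrub "$pool" || return 1
    # poll until the status output reports the scrub as completed
    until zpool status "$pool" | scrub_completed; do
        sleep 10
    done
    zpool offline "$pool" "$disk"
}
```

In this thread that would have been `scrub_then_offline storage ad2`.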


----------



## c_geier (Aug 26, 2010)

phoenix said:

> The process for doing so is:
> 
> zpool offline poolname devicename
> stop/detach the disk using controller methods or ata/ahci/camcontrol
> ...



Actually, I had to run

```
zpool online poolname devicename
```

before the pool's status changed to ONLINE again. Is this normal behaviour?


----------



## phoenix (Aug 27, 2010)

Oops, yes, you need to online a disk that has been offlined.
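
A small sketch of that last step, under the same assumptions about the `zpool status` table layout as shown earlier in the thread. `device_state` and `online_if_offline` are made-up helper names, not ZFS commands.

```shell
#!/bin/sh
# Hypothetical helper: prints the STATE column for one device from
# `zpool status` text on stdin, e.g. ONLINE or OFFLINE.
device_state() {
    awk -v dev="$1" '$1 == dev { print $2; exit }'
}

# Sketch: bring a disk back online only if the pool still lists it as OFFLINE.
# This runs real zpool commands.
online_if_offline() {
    pool=$1
    disk=$2
    state=$(zpool status "$pool" | device_state "$disk")
    if [ "$state" = "OFFLINE" ]; then
        zpool online "$pool" "$disk"
    fi
}
```

For example, `online_if_offline storage ad2` would only issue `zpool online storage ad2` if ad2 is still shown as OFFLINE.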


----------

