# Creating a consistent ZFS backup



## olav (Jun 3, 2010)

Is this possible?

I have a file storage where there is always some writing to the disk. To create a consistent backup I would need to deny new writes and wait for the current writes to finish before taking a snapshot. Are there some tools that can help with this?


----------



## Alt (Jun 3, 2010)

I think that before this, the application writing to the disk has to get its data into a consistent state (flush buffers, etc.). Then you can back it up or snapshot it.
So I think it's an application-level question, not directly related to ZFS.


----------



## graudeejs (Jun 3, 2010)

```
zfs snapshot pool/test@date
zfs send pool/test@date > /path/to/backup.zfs
```

Works fine for me.


Read zfs(1) for details.


----------



## phoenix (Jun 3, 2010)

Just snapshot the ZFS filesystem:
`# zfs snapshot poolname/filesystem@snapname`

Then tell your backup software to use the snapshot:
`# rsync -a /path/to/filesystem/.zfs/snapshot/snapname/ /path/to/backup/`

Or, if backing up to a ZFS filesystem on a remote server, use the ZFS send feature:
`# zfs send poolname/filesystem@snapname | ssh user@remote "zfs recv backuppool/filesystem"`

Or, if you just want a backup file that can be stored on disc, tape, etc:
`# zfs send poolname/filesystem@snapname > /path/to/archive/filename.whatever`

"zfs snapshot" flushes the filesystem's pending writes to disk and gives you a consistent point-in-time copy of the filesystem.  Just be sure to use the snapshot for the backups, and not the live filesystem.

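As a rough sketch of the snapshot-then-rsync approach, something like the following could work. The dataset, mountpoint and destination are placeholders, and the `backup-` name prefix and the `ZFS`/`RSYNC` override variables are assumptions added for illustration (they allow a dry run with `ZFS="echo zfs"`). It also assumes the snapshot is reachable under `.zfs/snapshot` at the dataset's mountpoint.

```sh
#!/bin/sh
# Sketch: snapshot a dataset, then back up from the snapshot,
# not from the live filesystem.
# ZFS and RSYNC are hypothetical overrides, e.g. ZFS="echo zfs" for a dry run.
: "${ZFS:=zfs}"
: "${RSYNC:=rsync}"

# Build a snapshot name such as backup-20100603-141500.
snapname() {
    date +backup-%Y%m%d-%H%M%S
}

# backup_from_snapshot DATASET MOUNTPOINT DEST [SNAPNAME]
backup_from_snapshot() {
    dataset=$1 mountpoint=$2 dest=$3
    snap=${4:-$(snapname)}
    # Stop here if the snapshot could not be created.
    $ZFS snapshot "$dataset@$snap" || return 1
    # Trailing slash: copy the contents of the snapshot directory.
    $RSYNC -a "$mountpoint/.zfs/snapshot/$snap/" "$dest/"
}

# Example (not run here):
# backup_from_snapshot poolname/filesystem /path/to/filesystem /path/to/backup
```

Because rsync reads from the snapshot directory, files that change on the live filesystem during the copy do not affect the backup.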

----------



## olav (Jun 3, 2010)

But it doesn't really answer the real question. What if I run a database and it has some transactions running? What if I have a VM image?

In the real world I would have to lock all tables in a database and drop all transactions. For the VM image, I would have to send a command to the guest OS to flush all data to disk and then "pause" until snapshot is complete.

I might have to use something else when it comes to backup of databases and VM images?


----------



## tingo (Jun 3, 2010)

In the real world, the simple solution is:
- stop the database, take a snapshot, restart the database

The same solution works for VMs.
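
A minimal sketch of that stop/snapshot/restart sequence, assuming the database is managed through a `service`-style command. The service name, dataset and snapshot name are placeholders, and the `ZFS`/`SERVICE` override variables are hypothetical additions so the sequence can be dry-run with `echo`.

```sh
#!/bin/sh
# Sketch: stop a service, snapshot its dataset, start the service again.
# ZFS and SERVICE are hypothetical overrides for dry runs.
: "${ZFS:=zfs}"
: "${SERVICE:=service}"

# snapshot_while_stopped SERVICE_NAME DATASET SNAPNAME
snapshot_while_stopped() {
    svc=$1 dataset=$2 snap=$3
    # Do not snapshot if the service could not be stopped.
    $SERVICE "$svc" stop || return 1
    status=0
    $ZFS snapshot "$dataset@$snap" || status=$?
    # Restart even if the snapshot failed, so the outage stays short.
    $SERVICE "$svc" start || return 1
    return $status
}

# Example (not run here):
# snapshot_while_stopped mysql-server poolname/db nightly
```

The snapshot itself takes very little time, so the downtime is mostly the time the database needs to shut down and start up.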


----------



## phoenix (Jun 3, 2010)

olav said:

> But it doesn't really answer the real question. What if I run a database and it has some transactions running?



So you add an extra line to the backup script that quiesces the database before calling "zfs snapshot".
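
For MySQL, that extra step might look roughly like the sketch below. It assumes the `mysql` command-line client accepts the `\!` (system) command on its input and picks up its credentials from the usual option file; both are assumptions about your setup. The dataset and snapshot names are placeholders, and the `ZFS`/`MYSQL` override variables are hypothetical. The point is that the read lock only lasts as long as the client session, so the snapshot has to be taken from inside that same session.

```sh
#!/bin/sh
# Sketch: hold a global read lock in one mysql session while snapshotting.
# ZFS and MYSQL are hypothetical overrides, e.g. MYSQL=cat to inspect the input.
: "${ZFS:=zfs}"
: "${MYSQL:=mysql}"

# Print the statements for one client session: lock, snapshot, unlock.
quiesce_sql() {
    printf '%s\n' \
        'FLUSH TABLES WITH READ LOCK;' \
        "\\! $ZFS snapshot $1@$2" \
        'UNLOCK TABLES;'
}

# snapshot_with_lock DATASET SNAPNAME
snapshot_with_lock() {
    quiesce_sql "$1" "$2" | $MYSQL
}

# Example (not run here):
# snapshot_with_lock poolname/db nightly
```

Running it with `MYSQL=cat` prints the statements instead of executing them, which is a cheap way to check what would be sent to the server.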



> What if I have a VM image?



You add a line to the backup script that takes a snapshot of the VM before taking the ZFS snapshot.  If you need to restore, you just start up from the VM snapshot.



> In the real world I would have to lock all tables in a database and drop all transactions.



How is this different from *ANY* backup solution?



> For the VM image, I would have to send a command to the guest OS to flush all data to disk and then "pause" until snapshot is complete.



Again, how is this different from any other backup solution?



> I might have to use something else when it comes to backup of databases and VM images?



We back up several dozen MySQL database servers using nothing more than mysqldump and rsync (of the dump and the live database files).  Every time we've had to restore from backups, we restore the database files first.  If that doesn't allow the database to start, then we import the dump file.  No lost data so far.

For our VMs, we just back them up like normal, physical servers (i.e., from inside the VM).  Restoration is a simple "create VM, boot VM, restore data" process, no different from a physical server.


----------



## carlton_draught (Jun 4, 2010)

olav said:

> Is this possible?
> 
> I have a file storage where there is always some writing to the disk. To create a consistent backup I would need to deny new writes and wait for the current writes to finish before taking a snapshot. Are there some tools that can help with this?


I'm not sure how many ZFS filesystems you have, but I suggest taking a recursive snapshot of all filesystems containing data you need to back up, at whatever level includes them all (quite possibly the pool). This makes the snapshots atomic, i.e. all of the snapshots taken by that one command correspond to one exact point in time, e.g.

`# zfs snapshot -r poolname@snapname`

But yeah, you can't blame ZFS for the fault of an RDBMS. Stopping database writes is not a filesystem-specific problem, and you should learn what best practice is for whatever application you are running. If you mentioned which database it is, perhaps people here would have a suggestion. For PostgreSQL, pg_dumpall writes your database cluster to a script which you can use to restore it. If you want a painless way of not having to restore, I guess you could use ZFS snapshots and stop/restart the database like tingo said.
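
One way to combine the two ideas is to write a PostgreSQL dump first and then take the recursive snapshot, so the dump is captured in the same point-in-time set. This is only a sketch: it assumes `pg_dumpall` can connect without prompting and that the dump file lives on a filesystem inside the pool. The pool name, snapshot name and path are placeholders, and the `ZFS`/`PG_DUMPALL` override variables are hypothetical.

```sh
#!/bin/sh
# Sketch: dump the PostgreSQL cluster, then snapshot the pool recursively.
# ZFS and PG_DUMPALL are hypothetical overrides for dry runs.
: "${ZFS:=zfs}"
: "${PG_DUMPALL:=pg_dumpall}"

# dump_and_snapshot POOL SNAPNAME DUMPFILE
dump_and_snapshot() {
    pool=$1 snap=$2 dumpfile=$3
    # Write to a temporary name first so a failed dump
    # never replaces the previous good one.
    $PG_DUMPALL > "$dumpfile.tmp" || { rm -f "$dumpfile.tmp"; return 1; }
    mv "$dumpfile.tmp" "$dumpfile" || return 1
    # -r snapshots the pool and all of its descendants in one atomic step.
    $ZFS snapshot -r "$pool@$snap"
}

# Example (not run here):
# dump_and_snapshot poolname nightly /path/to/dumps/all.sql
```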


----------

