# ZFS diff script



## kungfujesus (Feb 1, 2010)

I have been trying to develop a script that does the following when comparing a ZFS volume to a snapshot:

- Lists the new files (missing from the snapshot)
- Lists the missing files (in the snapshot but not in the current volume)
- Lists the potentially changed files

So far it hasn't worked. I've tweaked it several times, first by eliminating most non-printable characters before the comparison (this may be creating duplicate entries when fed through comm with both columns 1 and 2). The problem with that is that those non-printable characters evidently make a huge difference to ls -s. Perhaps somebody could help me with this script:


```
#!/usr/local/bin/bash

LC_ALL=C;
export LC_ALL=C;
LANG=en_US.UTF-8;
export LANG=en_US.UTF-8;

sudo mount -t mfs -o rw,noatime,-s8M md /mnt/rd;
touch /mnt/rd/1 /mnt/rd/2;
#cd $1 && find . -type f | tr -d \000 | sort -o /mnt/rd/1;
cd $1 && find . -type f | tr -d \000 | sort -o /mnt/rd/1;
#cd $1/.zfs/snapshot/$2 && find . -type f | tr -d \000 | sort -o /mnt/rd/2;
cd $1/.zfs/snapshot/$2 && find . -type f | sort -o /mnt/rd/2;

echo -ne "New files: \n" >> /mnt/rd/nmf;
comm -23i /mnt/rd/1 /mnt/rd/2 >> /mnt/rd/nmf;
echo -ne "Missing files: \n" >> /mnt/rd/nmf;
comm -13i /mnt/rd/1 /mnt/rd/2 >> /mnt/rd/nmf;
echo -ne "Most likely changed files: \n" >> /mnt/rd/nmf;
comm -12i /mnt/rd/1 /mnt/rd/2 >> /mnt/rd/same;

#do ls -s (same output as du) and get the file sizes
while read LINE; do
        cd $1 && echo -ne "size: `ls -s "$LINE"`\n"
        cd $1/.zfs/snapshot/$2 && echo -ne "size: `ls -s "$LINE"`\n"
done < /mnt/rd/same >> /mnt/rd/3;
#this is for the current

#do ls -s (same output as du) and get the file sizes
#cd $1/.zfs/snapshot/$2 && while read LINE; do
#       ls -s $1/.zfs/snapshot/$2/"$LINE"
#done < /mnt/rd/same >> /mnt/rd/4;
#this is for the snapshot

uniq -ui /mnt/rd/3 /mnt/rd/uniqs
cat /mnt/rd/uniqs >> /mnt/rd/nmf; #find unmatched sizes
cat /mnt/rd/nmf;
cd ~;
sudo umount /mnt/rd;
MDS=`sudo mdconfig -l`;

for i in $MDS;
do sudo mdconfig -d -u $i;
done;
```
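
For comparison, here is a minimal sketch of the same three-way comparison without the memory disk. It is not the script above: `list_files` and `compare_trees` are made-up names, `mktemp -d` stands in for the md mount, and `cmp -s` (a byte-by-byte comparison, slower than comparing `ls -s` sizes) is swapped in to decide whether a file changed. It assumes file names contain no newlines.

```sh
# Sketch only: list_files and compare_trees are illustrative names.

list_files() {
    # Print relative paths of regular files under "$1", sorted bytewise.
    (cd "$1" && find . -type f | LC_ALL=C sort)
}

compare_trees() {
    live=$1   # root of the live file system
    snap=$2   # root of the snapshot, e.g. <live>/.zfs/snapshot/<name>
    tmp=$(mktemp -d) || return 1
    list_files "$live" > "$tmp/live"
    list_files "$snap" > "$tmp/snap"

    echo "New files:"
    LC_ALL=C comm -23 "$tmp/live" "$tmp/snap"
    echo "Missing files:"
    LC_ALL=C comm -13 "$tmp/live" "$tmp/snap"
    echo "Possibly changed files:"
    # Paths present on both sides: report those whose contents differ.
    LC_ALL=C comm -12 "$tmp/live" "$tmp/snap" | while IFS= read -r f; do
        cmp -s "$live/$f" "$snap/$f" || printf '%s\n' "$f"
    done

    rm -rf "$tmp"
}
```

It would be called as `compare_trees /path/to/volume /path/to/volume/.zfs/snapshot/name`, with both paths standing in for real ones. Sorting and comparing under the same `LC_ALL=C` keeps `sort` and `comm` in agreement about ordering.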


----------



## kungfujesus (Feb 1, 2010)

Ahh, actually I found a problem: I kept the `tr -d \000` step in one of those pipes while removing it from the other. Fix that and the script seems to mostly work. My music collection still seems to be producing duplicate columns for comm so far, though.
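
One plausible explanation for the mismatch, assuming the script ran under bash as posted: an unquoted `\000` loses its backslash before tr sees it, so `tr -d \000` deletes every `0` character from the file names rather than NUL bytes. The file name below is made up for illustration.

```sh
# Unquoted: the shell strips the backslash, tr receives "000" and deletes each '0'.
unquoted=$(printf 'track01.mp3\n' | tr -d \000)
# Quoted: tr receives the octal escape \000 and deletes only NUL bytes.
quoted=$(printf 'track01.mp3\n' | tr -d '\000')

echo "$unquoted"   # track1.mp3
echo "$quoted"     # track01.mp3
```

With the filter applied to only one of the two lists, any name containing a `0` would differ between them and show up as both new and missing.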


----------



## kungfujesus (Feb 1, 2010)

Does anybody know a faster method than this, other than storing all the snapshot contents/changes in an internal SQLite database or something similar?


----------



## Alt (Feb 1, 2010)

It's an interesting feature. DragonFly BSD's HAMMER has a `hammer diff` feature, and it would be great if ZFS had the same thing.

About the script: it might be good to use the find(1) option -newer:

```
-newer file
             True if the current file has a more recent last modification time
             than file.
```
So we can do something like

```
find /mnt/storage1/ -newer /mnt/storage1/.zfs/snapshot/yesterday
```
But this way we can find only modified or created files, not deleted ones. Perhaps another utility is needed for that, or the '-ok' or '-exec' option could somehow be adapted.


----------



## Alt (Feb 1, 2010)

Finally, my suggestion:

List of new/modified files:

```
find /mnt/store1/ -newer /mnt/store1/.zfs/snapshot/yesterday -type f
```
List of files that were deleted:

```
cd /mnt/store1/.zfs/snapshot/yesterday
find . -type f -exec sh -c "stat -q /mnt/store1/{} >/dev/null || echo {}" \;
```
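
The deleted-files check splices `{}` into the shell command string, which can misbehave on names containing spaces or quotes. A hedged variant is sketched below; `list_deleted` and `LIVE_ROOT` are hypothetical names, and the live root is assumed to be an absolute path.

```sh
# Sketch only: list_deleted and LIVE_ROOT are illustrative names.
# $1 = absolute path of the live file system, $2 = path of the snapshot.
list_deleted() {
    (
        cd "$2" || exit 1
        # Hand each path to sh as a positional parameter instead of
        # embedding it in the command string.
        LIVE_ROOT=$1 find . -type f -exec sh -c '
            for f in "$@"; do
                [ -e "$LIVE_ROOT/$f" ] || printf "%s\n" "$f"
            done
        ' sh {} +
    )
}
```

Something like `list_deleted /mnt/store1 /mnt/store1/.zfs/snapshot/yesterday` would then print the snapshot files that no longer exist in the live tree, using `[ -e ]` in place of `stat -q`.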


----------



## avilla@ (Feb 1, 2010)

Alt said:

> It's an interesting feature. DragonFly BSD's HAMMER has a `hammer diff` feature, and it would be great if ZFS had the same thing.



There is a feature request about this on the ZFS website. It's quite old; let's hope they act on it.


----------



## kungfujesus (Feb 2, 2010)

Unfortunately that relies on access times, which most people who use ZFS turn off (to remove the overhead).


----------



## avilla@ (Feb 2, 2010)

xzhayon said:

> There is a feature request about this on the ZFS website. It's quite old; let's hope they act on it.



Found it: http://bugs.opensolaris.org/view_bug.do?bug_id=6425091


----------



## Alt (Feb 2, 2010)

kungfujesus said:

> Unfortunately that relies on access times, which most people who use ZFS turn off (to remove the overhead).


It's not atime, it's modification time, so using -newer is OK. For example:

```
# find d2 -newer d1 -type f
d2/file2
# find d2 -anewer d1 -type f
# cat d2/file1 > /dev/null
# find d2 -newer d1 -type f
d2/file2
# find d2 -anewer d1 -type f
d2/file1
# touch d2/file2
# find d2 -newer d1 -type f
d2/file2
# find d2 -anewer d1 -type f
d2/file1
d2/file2
```
As you can see, -newer shows only modified files, not accessed ones.


----------



## kungfujesus (Mar 2, 2010)

Alt, your solution doesn't seem to be working for me, but it could be due to FreeBSD not supporting unicode yet.


----------



## graudeejs (Mar 2, 2010)

kungfujesus said:

> Alt, your solution doesn't seem to be working for me, but it could be due to FreeBSD not supporting unicode yet.



??? FreeBSD does support Unicode.
The only thing is that the FreeBSD console doesn't support Unicode (but there is a project to fix that).


----------



## Alt (Mar 2, 2010)

kungfujesus said:

> Alt, your solution doesn't seem to be working for me, but it could be due to FreeBSD not supporting unicode yet.


I think you're talking all sorts of crap just to explain why you'll keep using your super-script.


----------



## kungfujesus (Mar 2, 2010)

Would you like me to show you it not working?

http://pohl.ececs.uc.edu/~adam/findcurrent.txt <-- that's the output of: `find /mnt/share/backup/downstairs/music/ -type f`

http://pohl.ececs.uc.edu/~adam/findsnap.txt <-- that's the output of: `find /mnt/share/backup/downstairs/music/.zfs/snapshot/021410/ -type f`

http://pohl.ececs.uc.edu/~adam/findnewer.txt <-- that's the output of: `find /mnt/share/backup/downstairs/music/ -newer /mnt/share/backup/downstairs/music/.zfs/snapshot/021410/ -type f`

You can see that it's showing identical files in both the snapshot and the current run of find.  The modification times should be identical, as I have not been touching anything, including tags.  The files are no different.  Explain that, smarty pants 

My script, however slow, is working. I _wish_ a better working alternative had been presented in this thread. If anyone has one, please let me know; I'm very open to suggestions.

This is what my script is outputting, and it is correct:

http://pohl.ececs.uc.edu/~adam/diffingoutput.txt

And yes Rihanna's on there, it wasn't for me


----------



## Alt (Mar 2, 2010)

This is specially for you, my dear pants :stud

```
# cp -r /usr/local/etc /test/
# cd /test/
# ll etc/
total 12
drwxr-xr-x  2 root  wheel     2 Mar  2 21:54 pam.d
drwxr-xr-x  2 root  wheel     3 Mar  2 21:54 rc.d
-r--r--r--  1 root  wheel  1443 Mar  2 21:54 slsh.rc
-r--r--r--  1 root  wheel  4224 Mar  2 21:54 wgetrc.sample
-r--r--r--  1 root  wheel   339 Mar  2 21:54 xml2Conf.sh
-r--r--r--  1 root  wheel   232 Mar  2 21:54 xsltConf.sh
# zfs snapshot test@s1
# rm etc/xsltConf.sh
# touch etc/wgetrc.sample
# :>etc/tempfile
# rm etc/tempfile
# find . -newer .zfs/snapshot/s1 -type f
./etc/wgetrc.sample
# cd .zfs/snapshot/s1
# find . -type f -exec sh -c "stat -q /test/{} >/dev/null || echo {}" \;
./etc/xsltConf.sh
# ^D..exit
```
So I guess something is touching your files, or the mount setup is incorrect.


----------



## kungfujesus (Mar 2, 2010)

Likely it's rsync, although it's not actually modifying the files.  But again, if it doesn't work after rsyncs, then it doesn't really work in this circumstance.


----------



## Alt (Mar 2, 2010)

Try the -u option for rsync.


----------



## kungfujesus (Mar 3, 2010)

That doesn't necessarily guarantee a perfect sync, though.  I want checksums to match with what's on the host.  This implies the OS that it's syncing from has proper modification times.


----------



## Alt (Mar 3, 2010)

Use the -c option then, huh.


----------



## kungfujesus (Mar 3, 2010)

Can -c and -u be used together? They're two different sync modes as far as I can tell. I'm currently using -a, which I believe uses fast checksums.

killasmurf86: sort, uniq, comm, and FreeBSD itself all fail to support Unicode properly. It wouldn't surprise me if rsync had issues, too.


----------



## Alt (Mar 4, 2010)

kungfujesus said:

> I'm currently using -a, which I believe uses fast checksums.


There is nothing about this in the man page. Check the mtime on the host file system; maybe something is touching the files.


----------



## kungfujesus (Mar 4, 2010)

It's an smbfs file system; I'm not even sure that -mtime is a parameter for mount_smbfs.


----------



## phoenix (Jul 8, 2011)

Just an update on this: ZFSv28 (now in 8-STABLE and -CURRENT) supports the "zfs diff" command, which will show you every file that changed between two snapshots (or between a snapshot and the current state of the filesystem). Since it runs inside the data management layer of ZFS, it's *extremely* fast (9.43s to do a diff between two backup snapshots and grep out files that match a specific server; 4.21s redirecting the output to /dev/null).

Shows each file that changed along with a prefix showing what changed:

- `+` file was added to the later snapshot (doesn't exist in the earlier)
- `-` file was removed from the later snapshot (only exists in the earlier)
- `R` file was renamed in the later snapshot
- `M` file was modified in some way in the later snapshot

More info here.
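
As an illustration, the prefixes can be tallied with a small filter. This is a sketch that assumes each line of `zfs diff` output starts with the change character followed by a tab; `summarize_zfs_diff` and the dataset and snapshot names in the comment are hypothetical.

```sh
# Sketch only: summarize_zfs_diff is an illustrative name.
# Reads `zfs diff` output on stdin and counts lines per change type,
# assuming the change character is the first tab-separated field.
summarize_zfs_diff() {
    awk -F '\t' '
        { count[$1]++ }
        END {
            printf "added=%d removed=%d renamed=%d modified=%d\n",
                count["+"], count["-"], count["R"], count["M"]
        }'
}

# Example invocation (placeholder names):
#   zfs diff tank/backups@monday tank/backups@tuesday | summarize_zfs_diff
```

Omitting the second snapshot compares the first snapshot with the current state of the filesystem.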


----------

