# What activities will lead to writing on the ZIL?



## t1066 (Apr 24, 2012)

I have been experimenting with adding a separate log device to my pool, and I got the following data after 15 days and 5 hours.


```
$ zpool iostat -v
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
myzfs       1.11T  4.33T      6      8   847K   520K
  raidz2    1.11T  4.33T      6      8   847K   484K
    gpt/akjc      -      -      2      2   145K   126K
    gpt/0773      -      -      2      2   173K   126K
    gpt/6062      -      -      1      2   112K   126K
    gpt/2651      -      -      1      2   129K   126K
    gpt/gvkc      -      -      2      2   174K   126K
    gpt/ja7h      -      -      2      2   129K   126K
logs            -      -      -      -      -      -
  gpt/log0  1.41M  4.97G      0      0      0  36.3K
cache           -      -      -      -      -      -
  gpt/cache0   104G  24.7M     19      0  2.40M  90.3K
  gpt/cache1   109G  24.9M     20      0  2.44M  90.6K
----------  -----  -----  -----  -----  -----  -----
ssdzfs      2.63G  7.31G      0      0  13.1K    619
  gpt/ssd1  2.63G  7.31G      0      0  13.1K    619
----------  -----  -----  -----  -----  -----  -----
```

Note that the gpt/log0 line above implies that at least 45 GB of data have been written to the ZIL. But this pool is mainly used for browsing and BitTorrent; it does not do any NFS sharing. I have so far identified a few activities that may cause writes to the ZIL: the BT client net-p2p/transmission, file extraction, log writing, and the flushing of the browser cache by www/firefox. I have now set the sync property to disabled on a few filesystems. I know that sync is good for filesystem consistency, but on a filesystem that only contains downloaded files this should not cause any lasting damage. However, I would still like to know: what other activities will lead to writing on the ZIL?
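For reference, sync is a per-dataset property, so it can be disabled only where the data is expendable; a minimal sketch, assuming a hypothetical dataset named myzfs/downloads:

```shell
# Check the current sync policy (standard, always, or disabled).
zfs get sync myzfs/downloads

# Disable synchronous writes for this dataset only; sync writes are
# then treated as async and never touch the ZIL/SLOG.
zfs set sync=disabled myzfs/downloads

# Revert to the default behaviour if needed.
zfs set sync=standard myzfs/downloads
```

Note that with sync=disabled, an application's fsync(2) no longer guarantees the data is on stable storage, so a crash can lose the last few seconds of writes; that is why it is only reasonable on datasets holding re-downloadable data.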

PS. I remember that before I added a separate log device, there were times when extracting a large archive would stall for a second before finishing. I can now confirm this is indeed what is happening, as I have since seen the used space on the log device grow to a few hundred MB in such cases. After a few minutes, it goes back down to around 1 MB.


----------



## gkontos (Apr 24, 2012)

Have a look at this article. It provides a good explanation and is a good starting point IMO.

There is also a somewhat outdated, but very well written, blog post about the L2ARC.

I have used LOG devices with a MySQL server running the InnoDB engine with a 16 KB recordsize. The performance boost is really amazing.
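Matching the dataset record size to InnoDB's 16 KB page size is the usual tuning behind this setup; a sketch, assuming a hypothetical dataset myzfs/mysql holding the InnoDB data files:

```shell
# Match ZFS recordsize to InnoDB's 16 KB page size, avoiding
# read-modify-write amplification on the database files.
# Only affects files written after the property is set.
zfs set recordsize=16K myzfs/mysql
zfs get recordsize myzfs/mysql
```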


----------



## Sebulon (Apr 25, 2012)

t1066 said:

> *gpt/log0  1.41M  4.97G      0      0      0  36.3K*
> 
> Note that the gpt/log0 line above implies that at least 45 GB of data have been written to the ZIL.



Whaaat? It has only allocated 1.41*MB* in total so far on that SLOG. How do you interpret that as 45*GB*?

/Sebulon


----------



## t1066 (Apr 25, 2012)

@gkontos

Thanks for the link to information on ZIL.

I think I misstated the motivation for my question. When I started this experiment, my expectation was that only around 1-2% of writes would go to the ZIL, not the 7% that I observed. So I am trying to look into the cause and see if I can eliminate some of the overhead.

PS. My L2ARC worked charmingly, providing a read bandwidth of 4.84 MB/s, which compares favorably with the 847 kB/s from the main pool. And there were times when the main pool was idle and all I/O came from the L2ARC.


----------



## t1066 (Apr 25, 2012)

Sebulon said:

> Whaaat? It has only allocated 1.41*MB* in total so far on that SLOG. How do you interpret that as 45*GB*?
> 
> /Sebulon



You have to look at the last number, 36.3K (per second), which is the average write bandwidth to the SLOG. And 36.3 KB/s × (15×24+5) hours × 3600 s/hour gives around 45 GB.
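The arithmetic can be sanity-checked directly from the figures in the iostat output:

```python
# Average SLOG write bandwidth reported by `zpool iostat -v`.
rate_bytes_per_s = 36.3 * 1024          # 36.3 KB/s

# Uptime when the command was run: 15 days and 5 hours.
uptime_s = (15 * 24 + 5) * 3600

total_gib = rate_bytes_per_s * uptime_s / 1024**3
print(f"{total_gib:.1f} GiB")           # roughly 45 GiB written in total
```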

The ZIL is only needed when a crash occurs and you have to replay the last txg to bring the filesystem back to a consistent state. Otherwise, once the data referenced by a particular txg have been written successfully to the main pool, the txg is discarded and the space is freed. That is how I have seen the allocated space go from 1 MB to 200 MB and back down to 1 MB again.


----------



## Sebulon (Apr 25, 2012)

@t1066

Ah, OK, got it.

Just a quick sidenote: you always have a ZIL in ZFS, and when you don't have a SLOG, as in this case, the ZIL lives in the main pool. But I think you knew that already; it was more of an FYI.

/Sebulon


----------



## phoenix (Apr 25, 2012)

t1066 said:

> I have been experimenting with adding a separate log device to my pool, and I got the following data after 15 days and 5 hours.
> 
> 
> ```
> ...



No, it doesn't.  It shows that 1.4 MB of data is in the ZIL, with 5 GB of free space in the ZIL, and that, on average since the system booted, writes to the ZIL happen at 36.3 KBps.  You can't use any of those numbers to extrapolate the usage of the ZIL based solely on the output of a single *zpool iostat* run.


----------



## t1066 (Apr 26, 2012)

phoenix said:

> No, it doesn't.  It shows that 1.4 MB of data is in the ZIL, with 5 GB of free space in the ZIL, and that, on average since the system booted, writes to the ZIL happen at 36.3 KBps.  You can't use any of those numbers to extrapolate the usage of the ZIL based solely on the output of a single *zpool iostat* run.



But as I mentioned at the start of the thread, the system had been running for 15 days and 5 hours when I ran the command. Simple arithmetic suggests that 45 GB were written and discarded during this period. Why do you think this is not true?


----------



## t1066 (May 8, 2012)

With sync set to disabled on a number of filesystems, the writes that go to the ZIL have dropped drastically. It is now just 128 bytes/s, roughly 1/300 of the previous bandwidth.


```
$ zpool iostat -v
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
myzfs       1.13T  4.31T     10      4  1.08M  63.7K
  raidz2    1.13T  4.31T     10      4  1.08M  63.6K
    gpt/akjc      -      -      5      1   259K  19.9K
    gpt/0773      -      -      4      1   246K  20.1K
    gpt/6062      -      -      4      1   250K  20.2K
    gpt/2651      -      -      4      1   262K  19.8K
    gpt/gvkc      -      -      5      1   257K  20.1K
    gpt/ja7h      -      -      5      1   257K  20.2K
logs            -      -      -      -      -      -
  gpt/log0  1.26M  4.97G      0      0      0    128
cache           -      -      -      -      -      -
  gpt/cache0  75.6G  28.4G     21      0  2.63M  69.2K
  gpt/cache1  76.4G  32.5G     21      0  2.66M  69.9K
----------  -----  -----  -----  -----  -----  -----
ssdzfs      3.10G  6.83G      0      0  15.2K    923
  gpt/ssd1  3.10G  6.83G      0      0  15.2K    923
----------  -----  -----  -----  -----  -----  -----
```
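The quoted reduction factor can be checked against the two iostat snapshots:

```python
# Average SLOG write bandwidth before and after setting sync=disabled,
# taken from the two `zpool iostat -v` snapshots.
before_bytes_per_s = 36.3 * 1024   # 36.3 KB/s
after_bytes_per_s = 128            # 128 bytes/s

reduction = before_bytes_per_s / after_bytes_per_s
print(f"reduced by a factor of ~{reduction:.0f}")   # ~290, i.e. roughly 1/300
```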

Moreover, I ran the following dtrace(1) script for one day.


```
#!/usr/sbin/dtrace -s

/* Count fsync() calls per process name. */
::fsync:entry
{
        @sync[execname] = count();
}

/* Stop after one day. */
:::tick-1d
{
        exit(0);
}

/* Print the per-process counts on exit. */
END
{
        printa(@sync);
}
```

I got the following result.


```
  mail.local                                                        2
  transmission-qt                                                   4
  thunderbird-bin                                                   9
  vi                                                               10
  sendmail                                                         12
  VirtualBox                                                       15
  gconfd-2                                                         19
  syslogd                                                          25
  VBoxSVC                                                          27
  more                                                             80
  firefox-bin                                                     986
  midori                                                         2463
```

This gives me a basic grip on what is writing to the ZIL and how frequent those writes are. And if necessary, the above script can be readily modified to test the effect of other commands.
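For a more direct view, one could trace ZIL commits themselves rather than fsync() calls, since O_SYNC writes and other synchronous operations also end up in the ZIL. A sketch, assuming the kernel's fbt provider exposes the ZFS zil_commit function (as it does on FreeBSD and illumos):

```
#!/usr/sbin/dtrace -s

/* Count ZIL commits per process name; fsync(), O_SYNC writes and
   other synchronous operations all funnel through zil_commit(). */
fbt::zil_commit:entry
{
        @commits[execname] = count();
}

/* Print and reset the summary every 10 seconds. */
tick-10s
{
        printa(@commits);
        trunc(@commits);
}
```

This separates true ZIL traffic from fsync() calls that ZFS may satisfy without a log write.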


----------

