# CCACHE and poudriere: why so ineffective



## abishai (Sep 23, 2017)

I'm migrating from portmaster to a centralized poudriere and have a question about ccache. I built packages for my laptop, then deleted the repository and built it one more time.

```
abishai@poudriere:~ % env CCACHE_DIR=/var/cache/ccache ccache -s
cache directory                     /var/cache/ccache
primary config                      /var/cache/ccache/ccache.conf
secondary config      (readonly)    /usr/local/etc/ccache.conf
cache hit (direct)                 16101
cache hit (preprocessed)           14592
cache miss                        223923
cache hit rate                     12.05 %
called for link                    22586
called for preprocessing           14685
multiple source files                 40
compiler produced stdout               4
compile failed                      5472
ccache internal error                 21
preprocessor error                  4812
can't use precompiled header          64
bad compiler arguments              1888
unsupported source language           50
autoconf compile/link              43201
unsupported compiler option         1989
unsupported code directive             6
no input file                      18145
cleanups performed                   200
files in cache                    147978
cache size                           4.4 GB
max cache size                       5.0 GB
```
The cache hit rate is awful and the build time is roughly the same. Can it be improved?
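For reference, the reported hit rate can be reproduced from the counters as hits / (hits + misses), counting both direct and preprocessed hits. The sketch below is only an illustration: `hit_rate` is a hypothetical helper, not part of ccache, and it assumes the `ccache -s` text layout shown in this post.

```python
import re

def hit_rate(stats_text: str) -> float:
    """Return the hit rate in percent from `ccache -s` style text (hypothetical helper)."""
    counters = {}
    for line in stats_text.splitlines():
        # Counter lines look like "cache miss      223923": a label, padding, an integer.
        match = re.match(r"^(.*?)\s{2,}(\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    hits = (counters.get("cache hit (direct)", 0)
            + counters.get("cache hit (preprocessed)", 0))
    total = hits + counters.get("cache miss", 0)
    return 100.0 * hits / total if total else 0.0

# Counters copied from the output above.
sample = """\
cache hit (direct)                 16101
cache hit (preprocessed)           14592
cache miss                        223923
"""
print(f"{hit_rate(sample):.2f} %")  # 12.05 %
```

The other counters (link calls, autoconf tests, unsupported options and so on) do not enter this ratio, so the 12.05 % describes only the compilations ccache actually tried to cache.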


----------



## abishai (Sep 23, 2017)

getopt said:


> might be young


How can it be young? Cold cache -> repository built -> same repository rebuilt. I thought the hit ratio would be 50% (0% hits for the first build, 100% for the second).
Maybe I did something wrong? I set CCACHE_DIR in poudriere.conf and that's all. It's used, according to the logs:

```
[laptop-default-job-02] Installing ccache-3.3.4_7...
[laptop-default-job-02] Extracting ccache-3.3.4_7: ....... done
Create compiler links...
create symlink for cc
create symlink for cc (world)
create symlink for c++
create symlink for c++ (world)
create symlink for CC
create symlink for CC (world)
create symlink for clang
create symlink for clang (world)
create symlink for clang++
create symlink for clang++ (world)
```


----------



## chrbr (Sep 23, 2017)

abishai said:


> How can it be young? Cold cache -> repository built -> same repository rebuilt. I thought the hit ratio would be 50% (0% hits for the first build, 100% for the second).


This sounds reasonable. But only if all source files fit into the cache.
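A toy model makes the "only if it fits" condition concrete. The sketch below assumes a single strict LRU cache and a rebuild that compiles files in the same order as the first build. Real ccache cleanup and parallel poudriere builders behave differently, so this is a simplification, not ccache's actual algorithm. `two_pass_hit_rate` is a hypothetical name.

```python
from collections import OrderedDict

def two_pass_hit_rate(num_objects: int, capacity: int) -> float:
    """Compile the same `num_objects` files twice through an LRU cache of `capacity` entries."""
    cache = OrderedDict()
    hits = lookups = 0
    for _ in range(2):                      # first build, then an identical rebuild
        for obj in range(num_objects):
            lookups += 1
            if obj in cache:
                hits += 1
                cache.move_to_end(obj)      # mark as most recently used
            else:
                cache[obj] = True           # store the new result
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the least recently used entry
    return 100.0 * hits / lookups

print(two_pass_hit_rate(1000, 1000))  # 50.0: everything from build 1 is still cached
print(two_pass_hit_rate(1000, 800))   # 0.0: early results are evicted before the rebuild reaches them
```

In this model a cache only 20% too small already drops the overall rate from 50% to 0%, because each early entry is evicted just before the rebuild asks for it.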


abishai said:


> cleanups performed 200


This is the strange part of the output of `ccache -s`. Why are there so many cleanups?

Does poudriere.conf point to the correct ccache directory? If yes, you could try to build just a single moderate-sized port twice using poudriere(8). The lines below `bulk` in that man page show how to do that. Then the effect should be noticeable in the output of `ccache -s` and in the build time. I am sure you will eventually be able to bring the cache performance to a good level.


----------



## abishai (Sep 25, 2017)

I ran such a test with `poudriere testport -j laptop -o lang/python36` and got sane results: 49.84% hits after the second try. Maybe during bulk builds the data from the beginning was evicted from the cache, so the package rebuild did not fit in the 'cache window'? However, the cache size is less than the max cache size:

```
cache size 4.4 GB 
max cache size 5.0 GB
```
I'm increasing the cache size from 5 GB to 10 GB and am going to repeat the process.


----------



## xtaz (Sep 25, 2017)

I see the same thing. I would be interested in your findings. I've always had a max cache size of 10GB and it's only about 1.4GB used but the hit rate used to be *far* higher when I used to use portmaster or synth. With poudriere it seems to do something which invalidates the cache which the others don't do. Poudriere gives me around 35% whereas synth and portmaster used to give me more like 60% I think.


----------



## chrbr (Sep 25, 2017)

abishai said:


> I ran such a test with `poudriere testport -j laptop -o lang/python36` and got sane results: 49.84% hits after the second try.


That sounds perfect!


abishai said:


> Maybe during bulk builds the data from the beginning was evicted from the cache, so the package rebuild did not fit in the 'cache window'?


Yes, I think so. Data gets kicked out of the cache before it can be re-used.


abishai said:


> However, cache size is less than max cache size
> 
> ```
> cache size 4.4 GB
> ...


I have the same cache size and currently 4.4 GB are in use, too. Occupied sections must be freed before they can be re-used, so the process needs some headroom.
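One possible way to put numbers on that headroom, as a rough sketch: the ccache 3.3 manual describes a `limit_multiple` setting (default 0.8) to which a cleanup trims the cache once the size limit is exceeded. If that applies here, a full cache would hover between 80% and 100% of the maximum. The helper name is hypothetical and the 0.8 default is worth checking against the manual for the installed version.

```python
def steady_state_bounds(max_size_gb: float, limit_multiple: float = 0.8):
    """Rough (low, high) range for a cache that keeps hitting its size limit.

    Assumes cleanup trims to `limit_multiple` of the maximum (0.8 is the
    default described in the ccache 3.3 manual; verify for your version).
    """
    return (max_size_gb * limit_multiple, max_size_gb)

low, high = steady_state_bounds(5.0)
print(f"{low:.1f} GB to {high:.1f} GB")  # 4.0 GB to 5.0 GB
```

Under that assumption, a reported size of 4.4 GB against a 5.0 GB maximum is what a cache that is effectively full would show, so "cache size is less than max cache size" does not rule out eviction.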


xtaz said:


> With poudriere it seems to do something which invalidates the cache which the others don't do. Poudriere gives me around 35% whereas synth and portmaster used to give me more like 60% I think.


Maybe that is because ports-mgmt/poudriere always rebuilds everything related to modified ports, just to be on the safe side. My current rate with ports-mgmt/poudriere is 58%. There seems to be an optimal cache size that depends on the number and size of the ports.


----------



## abishai (Sep 26, 2017)

chrbr said:


> Yes, I think so. Data gets kicked out of the cache before it can be re-used.


Looks like this was the issue with my previous builds. I increased the cache size to 10 GB, and after the first build the cache size exceeded the 5 GB boundary. The second build is running now and it's definitely receiving a HUGE boost from cache hits.

```
abishai@poudriere:~ % env CCACHE_DIR=/var/cache/ccache ccache -s
cache directory                     /var/cache/ccache
primary config                      /var/cache/ccache/ccache.conf
secondary config      (readonly)    /usr/local/etc/ccache.conf
cache hit (direct)                 10360
cache hit (preprocessed)            3833
cache miss                         76051
cache hit rate                     15.73 %
called for link                     8428
called for preprocessing            3901
multiple source files                 17
compiler produced stdout               1
compile failed                      2333
ccache internal error                  7
preprocessor error                  1782
can't use precompiled header          16
bad compiler arguments               682
unsupported source language           23
autoconf compile/link              16443
unsupported compiler option          663
unsupported code directive             2
no input file                       6331
cleanups performed                     0
files in cache                    202112
cache size                           5.4 GB
max cache size                      10.0 GB
```


----------



## xtaz (Sep 26, 2017)

chrbr, Synth does the same thing, builds ports in a clean jail/chrooted environment. So it's a bit weird that synth hits 60% and poudriere only hits about 35%.

For me it's not a case of running out of cache.


```
cache hit (direct)                 19204
cache hit (preprocessed)            3180
cache miss                         41171
cache hit rate                     35.22 %
cache size                           1.8 GB
max cache size                      10.0 GB
```

I suppose I really need to do the same kind of test: clean the cache, do a full poudriere run with a deleted repository, delete the repository again and rebuild, and then see what the stats say. You would expect to see mostly hits and not misses in that scenario.


----------



## chrbr (Sep 26, 2017)

xtaz said:


> Synth does the same thing, builds ports in a clean jail/chrooted environment. So it's a bit weird that synth hits 60% and poudriere only hits about 35%.


The only difference is that ports-mgmt/synth can use pre-compiled packages from the FreeBSD servers if the options are default. I would not be surprised if some of the huge ports involve automatic code generation, which would pollute the cache with different code for the same functionality. Besides that, I agree with you: ports-mgmt/synth and ports-mgmt/poudriere should lead to similar results.


----------



## xtaz (Sep 27, 2017)

OK I take it back. I've just done a proper test. Deleted my entire ccache directory, deleted my entire poudriere repository, and then let it build it. It had a cache hit rate of 6% (surprising!). Then I deleted the poudriere repository again and let it rebuild it. This time it has a cache hit rate of 53%. So actually that's what you would expect.

I think what I may have been seeing is that when I had synth I did a lot of that type of thing, deleting the repo and rebuilding it. Whereas on poudriere I've not really touched it and just been letting it build things as required.
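The statistics printed by `ccache -s` are cumulative, so the 53% above mixes both builds. As a rough sketch, if both builds performed the same number of cacheable compilations and the counters were not zeroed in between (both are assumptions), the hit rate of the second build alone follows from the two cumulative figures. `second_run_hit_rate` is a hypothetical helper.

```python
def second_run_hit_rate(cumulative_after_first: float,
                        cumulative_after_second: float) -> float:
    """Per-run hit rate (percent) of the second of two equally sized builds.

    With n cacheable compilations per build:
        cumulative_after_second = (h1 * n + h2 * n) / (2 * n)
    so  h2 = 2 * cumulative_after_second - h1,
    where h1 is the cumulative rate after the first build.
    """
    return 2 * cumulative_after_second - cumulative_after_first

print(second_run_hit_rate(6.0, 53.0))   # 100.0 (the figures from this post)
print(second_run_hit_rate(0.0, 49.84))  # 99.68 (the lang/python36 test, assuming 0% on the first try)
```

Read this way, a cumulative 53% after a cold 6% first build means that practically every compilation in the rebuild was served from the cache.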


----------

