# poudriere never completes first run without errors



## mast07 (Dec 10, 2020)

Hi,
I'm using poudriere to build some ports where I like to alter the default options (e.g. firefox). When I then start poudriere to build all the packages, it never just builds them. It almost always ends up building at least one (or more) llvm versions and rust simultaneously, and that's the point where the build of one of those ports fails (in most cases rust fails to build). The log shows that it got a signal 9 (SIGKILL), but I cannot figure out a reason for that. If I start poudriere once again (llvm has finished without error in most cases), it will build the remaining packages (including rust) without failing.
I assume it has something to do with the simultaneous compilations of those heavyweight ports llvm and rust. My build-server is a HP-Z400 workstation with 12 Cores and 16GByte RAM + 10GByte swap (which is in most cases only filled with some MB).

Any ideas what causes the abort of those builds? Could the jail hit some fd-limitations?


----------



## SirDice (Dec 10, 2020)

mast07 said:


> The log shows that it got a signal 9 (SIGKILL), but I cannot figure out a reason for that.


Resource starvation is the most likely culprit. I have the same issue.


----------



## mast07 (Dec 10, 2020)

Hm...
dmesg shows me this line:


```
pid 23078 (fabricate), jid 6, uid 65534, was killed: out of swap space
```

But I do have 10GByte of swap space which is barely touched. How come?
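
A quick way to see which processes the kernel killed is to filter the kernel message buffer. This is only a sketch: `oom_killed_names` is a made-up helper name, and it assumes the message text matches the dmesg line quoted above.

```sh
# Print the process name from every "out of swap space" kill line
# found in dmesg-style text read from stdin.
oom_killed_names() {
    sed -n 's/^pid [0-9]* (\([^)]*\)).*was killed: out of swap space.*/\1/p'
}
```

Used as `dmesg | oom_killed_names | sort | uniq -c`, it would give a per-process count of kills for the current boot.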


----------



## diizzy (Dec 10, 2020)

Building rust is _very_ resource heavy, I think you need 8 GB+ of RAM (16 GB is recommended) to compile Rust these days (for fewer than 4 jobs).


----------



## SirDice (Dec 10, 2020)

diizzy said:


> Building rust is _very_ resource heavy


It is. This doesn't improve when poudriere tries to build one or more versions of llvm at the same time. Then things get really messy and suck up a LOT of resources.


----------



## mast07 (Dec 10, 2020)

I now understand that building llvm and rust, especially simultaneously, is really heavy work for the build server. On the other hand, I was under the impression that I was providing enough hardware to lift some of that heavy weight (16GB RAM + 9GB swap). Are there any tips where I could tune some bits to keep the build running (even if it takes longer to finish the build)? It is a bit annoying to start a build in the evening to have the packages built overnight, just to see that it was aborted two hours after the start, leaving a huge chunk of ports in the skipped list. Even rust alone is sometimes aborted now.


----------



## T-Daemon (Dec 10, 2020)

mast07 said:


> Are there any tips where I could tune some bits to keep the build running


Increase swap space.

Handbook, 4.6. Building Packages with Poudriere


```
The number of processor cores detected is used to define how many builds will run in parallel.
Supply enough virtual memory, either with RAM or swap space. If virtual memory runs out, the compilation
jails will stop and be torn down, resulting in weird error messages.
```


----------



## Jose (Dec 11, 2020)

Do you have `ALLOW_MAKE_JOBS` enabled in your poudriere.conf? Have you set `MAKE_JOBS_NUMBER` in your make.conf?

Edit: Are you using ccache(1)?


----------



## mast07 (Dec 11, 2020)

T-Daemon 
I've found an older 160GByte HDD lying around and added it as swap device. Let's see if this helps...

Jose 
Yes, I do use ALLOW_MAKE_JOBS, but left the default value for MAKE_JOBS_NUMBER untouched.

I had used ccache, but that did not work well for me. It locked up the whole machine after I switched FreeBSD releases and rebuilt the packages without deleting the ccache folder first.


----------



## SirDice (Dec 11, 2020)

Jose said:


> Do you have ALLOW_MAKE_JOBS enabled in your poudriere.conf?


Rust seems to completely ignore this and use all cores regardless. As far as I know this is intentional. Even with all cores it takes forever to build, if it would only use one core we'd be talking about build-times in days instead of hours.


----------



## Jose (Dec 11, 2020)

mast07 said:


> Yes, I do use ALLOW_MAKE_JOBS, but left the default value for MAKE_JOBS_NUMBER untouched.


This didn't work for me. Builds would have lots of workers killed by the kernel, and would usually eventually die. The limiting factor is memory. I only configured 4GB of swap when I built this machine, a decision I regret.

There are lots of suggestions in this thread: "HOWTO: Speeding up poudriere build times" (forums.freebsd.org).

You could try using `ALLOW_MAKE_JOBS_PACKAGES` as SirDice suggests in that thread. I didn't want to maintain the list of ports that are allowed to run in parallel.

By default, both poudriere(8) and the Ports system use sysctl(8) to determine how many tasks to run in parallel. They use slightly different MIBs, but these report the same number on my system, which works out to two times the number of cores because each core has two threads.

So I wound up with 32 poudriere(8) workers each possibly running 32 make(1) tasks. I never quite hit 1024 processes, probably because a lot of ports don't invoke parallel make(1), and because most builds have significant non-parallel stages (automake comes to mind). My load average did break 100, and I did get lots of killed processes in my dmesg(8), though. The problem was memory starvation. The system was still semi-responsive.

I tuned the parallelism of my builds because I didn't want to curate a list of blessed ports, and I didn't want to have 15 cores sitting idle most of the time. The approach I came up with was to set `PARALLEL_JOBS=16` in poudriere.conf and `MAKE_JOBS_NUMBER=16` in make.conf. In practice my load average never hits 256, but works out to a max of 64 or so for the particular set of ports I build, and mostly hovers around 16 during the build. I still get some OOM kills, but the poudriere(8) runs usually finish if I quit Thunderbird, and don't open too many tabs in Firefox.

I wouldn't copy these numbers into your configuration files. They probably only work for my hardware and workload.
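
For illustration, the worst case described above is simply workers times make jobs. A small sketch in sh; the function name `worst_case_jobs` is made up for this example, and the two settings are the ones named above:

```sh
# Upper bound on concurrent make(1) jobs:
# poudriere workers multiplied by make jobs per worker.
worst_case_jobs() {
    echo $(( $1 * $2 ))
}

worst_case_jobs 32 32   # defaults on a 32-thread machine: prints 1024
worst_case_jobs 16 16   # PARALLEL_JOBS=16 with MAKE_JOBS_NUMBER=16: prints 256
```

The real load stays well below these bounds because many ports never run make(1) in parallel, but the product is a useful sanity check before picking values.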



mast07 said:


> I had used ccache, but that did not work well for me. It locked up the whole machine after I switched FreeBSD releases and rebuilt the packages without deleting the ccache folder first.


That's too bad, ccache(1) made a huge difference for me, especially when building multiple versions of LLVM at the same time.



SirDice said:


> Rust seems to completely ignore this and use all cores regardless. As far as I know this is intentional. Even with all cores it takes forever to build, if it would only use one core we'd be talking about build-times in days instead of hours.


I've noticed this too, but it seems to me Rust is running multiple threads instead of processes. Still obnoxious, but more memory-efficient.

Edit to add


mast07 said:


> I've found an older 160GByte HDD lying around and added it as swap device. Let's see if this helps...


This is almost certainly a bad idea. A 160 GB HDD is likely truly ancient and therefore glacially slow. My guess is you'll trade OOM kills for timeouts.


----------



## acheron (Dec 11, 2020)

Do you have USE_TMPFS (not sure of the name) in poudriere.conf?


----------



## Jose (Dec 12, 2020)

acheron said:


> Do you have USE_TMPFS (not sure of the name) in poudriere.conf?


Good question. Using tmpfs(5) will cause memory contention. I have `USE_TMPFS=data`. I figured this was the best combination for speed and efficient use of memory.


```
# Use tmpfs(5)
# This can be a space-separated list of options:
# wrkdir    - Use tmpfs(5) for port building WRKDIRPREFIX
# data      - Use tmpfs(5) for poudriere cache/temp build data
# localbase - Use tmpfs(5) for LOCALBASE (installing ports for packaging/testing)
# all       - Run the entire build in memory, including builder jails.
# yes       - Enables tmpfs(5) for wrkdir and data
# no        - Disable use of tmpfs(5)
# EXAMPLE: USE_TMPFS="wrkdir data"
```
The default is "yes" (wrkdir and data). Maybe I should just turn it off since I have an NVMe drive.
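
If the tmpfs-backed work directories are what eats the RAM, the setting could be narrowed as in the fragment below. This is a sketch only: the file path assumes a default install, and whether it helps depends on how fast the disk is.

```sh
# /usr/local/etc/poudriere.conf (fragment; path assumed from a default install)
# Keep only poudriere's cache/temp build data in tmpfs(5). WRKDIRPREFIX stays
# on disk, so extracted sources and object files do not compete with the
# compilers for RAM.
USE_TMPFS="data"
```

Setting `USE_TMPFS=no` instead would take tmpfs(5) out of the picture entirely, at the cost of more disk I/O.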


----------



## Jose (Dec 13, 2020)

Thinking more on this, it would be nice if you could give Poudriere a target load average and it could tune its parallelism accordingly. Or maybe a max load average and have it cut back on the number of workers as it approached this limit.


----------



## rigoletto@ (Dec 13, 2020)

Run ports-mgmt/poudriere with just one job for those problematic things first. Later run the whole build you need with the optimizations[1].

`poudriere bulk -r -t -J 1:32 -j JAILNAME lang/rust devel/llvm80 devel/llvm90 devel/llvm10 lang/gcc9`

Adjust for your needs...

[1] all in the same jail, of course.
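
The two-stage idea could be wrapped in a small script along these lines. It is a sketch under assumptions: `two_pass_build`, `HEAVY_PORTS` and the package list path are invented for the example, `JAILNAME` is the same placeholder as above, and the `-r -t` flags and the prebuild limit from the command above are left out to keep it short.

```sh
# Names below are illustrative; override them in the environment as needed.
: "${POUDRIERE:=poudriere}"
: "${JAIL:=JAILNAME}"
: "${HEAVY_PORTS:=lang/rust devel/llvm10}"

# two_pass_build PKGLIST
# Pass 1 builds the heavy ports with a single builder (-J 1).
# Pass 2 builds everything listed in PKGLIST with the configured parallelism.
two_pass_build() {
    pkglist=$1
    # $HEAVY_PORTS is deliberately unquoted so each origin is its own argument.
    $POUDRIERE bulk -j "$JAIL" -J 1 $HEAVY_PORTS || return 1
    $POUDRIERE bulk -j "$JAIL" -f "$pkglist"
}
```

It would be called as `two_pass_build /path/to/pkglist`. The second pass finds the heavy packages already built and skips them.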


----------



## rigoletto@ (Dec 13, 2020)

mast07 said:


> But I do have 10GByte of swap space which is barely touched. How come?


With 16GB RAM it is hard to believe the swap is barely touched. Rust alone would use at least a couple of GB, provided you are not using ZFS or have ZFS_ARC set to a minimum (or you have just restarted the computer).

Are you measuring the swap usage during the whole process, how?


----------



## Jose (Dec 13, 2020)

rigoletto@ said:


> `poudriere bulk -r -t -J 1:32 -j JAILNAME lang/rust devel/llvm80 devel/llvm90 devel/llvm10 lang/gcc9`


The man page is not terribly clear, but it seems that `-t` will disable parallel make as well. This, plus running only one Poudriere worker means at most one core will be used, except maybe by Rust which cheats. It would not be appropriate for a machine with more than a few cores.

Why the large number of premake jobs?


----------



## rigoletto@ (Dec 13, 2020)

Jose said:


> Why the large number of premake jobs?


I found it works OK in this scenario with my 8-core processor.


----------



## rigoletto@ (Dec 13, 2020)

You can also try THIS version, which considerably improves the build time; however, IDK if it was already merged or not (I have not been following what has been going on lately).


----------



## mast07 (Dec 14, 2020)

rigoletto@ said:


> Are you measuring the swap usage during the whole process, how?


I have to admit that this was done unprofessionally, by keeping an "observing eye"™ on an ssh session running top(1). On the build server I'm using ZFS, and ARC_MAX is set to 2GByte via /boot/loader.conf.
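
A less eyeball-dependent way would be to log swap usage for the whole run. The sketch below assumes the usual `swapinfo -k` layout (a header line, one line per device, and a "Total" line when there are several devices); `swap_summary`, `log_swap` and the log path are made-up names for this example.

```sh
# Read `swapinfo -k` output on stdin and print "used_kb total_kb",
# summed over all devices (header and "Total" lines are skipped).
swap_summary() {
    awk 'NR > 1 && $1 != "Total" { total += $2; used += $3 }
         END { printf "%d %d\n", used, total }'
}

# log_swap [INTERVAL_SECONDS] [LOGFILE]
# Append a timestamped sample until the loop is killed.
log_swap() {
    interval=${1:-30}
    logfile=${2:-/tmp/swap-usage.log}
    while :; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(swapinfo -k | swap_summary)" >> "$logfile"
        sleep "$interval"
    done
}
```

Starting `log_swap 30 /tmp/swap-usage.log &` before the bulk run and reading the file afterwards would show whether swap really stayed nearly empty right up to the kill.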


acheron said:


> Do you have USE_TMPFS (not sure of the name) in poudriere.conf?


I did not alter the default, which is USE_TMPFS=yes.


rigoletto@ said:


> Run ports-mgmt/poudriere with just one job for those problematic things first. Later run the whole build you need with the optimizations[1].
> 
> `poudriere bulk -r -t -J 1:32 -j JAILNAME lang/rust devel/llvm80 devel/llvm90 devel/llvm10 lang/gcc9`
> 
> Adjust for your needs...


I will try this for my next builds, thank you.


----------



## SirDice (Dec 14, 2020)

I also have 16GB of RAM, and have set up 16GB of swap. I started a build around 10:00 this morning. Swap is barely touched during the build (one of the jobs is currently building Rust). I do have TMPFS enabled.

Sidenote: I'm using ports-mgmt/poudriere-devel, that may be important.


----------



## OlivierW (Dec 14, 2020)

Is there a way to set up poudriere so it won't build rust, llvm9, llvm10 and other heavyweight ports?

I'm not changing any configuration of these 3 ports, I am only building other ports which may have them as dependencies, and so they get built… To me it would be OK to use the binary packages of rust or llvm*.

My poor server has "only" 16GB of RAM and plenty of free swap space, but the CPU isn't very powerful, so these ports take between 4 and 6 hours each time they are modified. Sometimes ccache helps, but not always.
As SirDice said, the worst is when the three are building at the same time.


----------



## SirDice (Dec 14, 2020)

So far, so good. Yes, I get double or even triple load (building Rust, LLVM, etc. at the same time) every now and then and it takes forever. But it does build without issues, and at least it doesn't crash the server any more. I do get failures sometimes and a bunch of ports being skipped because of it. I build often enough and restart builds frequently anyway, so it doesn't really bother me.


----------

