# question about webserver in jail



## wonslung (Jun 5, 2009)

Is there anything you need to do differently when setting up a webserver in a jail?
The reason I ask is that I have 2 nearly identical systems, one using jails and one not.

The system I set up without jails works fine. The system using jails has this weird issue where everything works fine until you start the webserver. I thought it might be configuration based, and I guess it STILL might be, but here's what happens:

When you start the webserver everything is fine at first, but then lag starts building. Pages load slower and slower as time goes on, and the entire system lags out until you stop the webserver. I've tried everything I know to try, including 3 different webservers (nginx, Apache and lighttpd), and all 3 do the same thing. I was thinking it might be PHP, but I've set up php.ini and xcache EXACTLY the same as on the working system.

It's FreeBSD 7.2.
I have PHP installed from ports with the default php.ini.

Any help would be appreciated.


----------



## BuSerD (Jun 6, 2009)

Sounds like your jail is exceeding its resource limits. Have a look at jtune for adjusting that. A little background is provided at http://wiki.freebsd.org/JailResourceLimits


----------



## wonslung (Jun 7, 2009)

Nah, that's not what was happening.

It's a quad core Xeon with 8GB RAM.

I reinstalled and I think things are working better...

This "slow down" only happened when the web server was running, which makes me think maybe it had something to do with PHP. I don't know, but it was just odd.


----------



## vivek (Jun 7, 2009)

Any errors in the httpd error log file? What about total connections (connections per second)? Turn on mod_status to get info about connections. If there are too many connections, Apache will slow down or just die. The error log is the first place to look for errors.
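
For what it's worth, once mod_status is enabled Apache also serves a machine-readable summary at the `server-status?auto` URL, as plain `Key: value` lines. Here is a minimal sketch of pulling the worker counts out of that text. The helper names (`parse_status`, `worker_usage`) and the sample numbers are made up for illustration; they are not from this thread.

```python
def parse_status(text):
    """Parse 'Key: value' lines from mod_status ?auto output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def worker_usage(fields):
    """Return the fraction of workers that are busy (0.0 to 1.0)."""
    busy = int(fields["BusyWorkers"])
    idle = int(fields["IdleWorkers"])
    total = busy + idle
    return busy / total if total else 0.0


# Illustrative sample only; these numbers are invented, not measured.
SAMPLE = """Total Accesses: 1200
ReqPerSec: 4.5
BusyWorkers: 9
IdleWorkers: 1
"""

if __name__ == "__main__":
    fields = parse_status(SAMPLE)
    print("requests/sec:", fields["ReqPerSec"])
    print("busy fraction:", worker_usage(fields))
```

If the busy fraction stays near 1.0 while pages are slow, the server is probably running out of workers rather than hitting some other limit.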


----------



## wonslung (Jun 7, 2009)

Hey BuSerD, I misunderstood what you were saying...

I wasn't aware that limits were automatically attached to jails. Is there any way to disable this and just allow them to use as much as they need, or how can I set the limits?


----------



## vivek (Jun 7, 2009)

Don't worry, it is not part of the 7.2 system. You do not have any limits on your jails. The patch does not work with the 7.2 release. Your problem is somewhere else...


----------



## wonslung (Jun 7, 2009)

Well, that is odd.

Seriously, as a test I did this:

On the host system, I set up Apache, MySQL and PHP plus my 5 sites and ran them. They run fine: quick, perfect. Then I shut them down, enable the jail, which has the EXACT same ports installed with the same options, reboot (or just start the jails; I've done both), and they run OK for about 3 minutes, then get slower and slower.

I'm totally lost as to what's causing it...

Also, I've tried using ezjail, and I've also tried making the jails from scratch.

It only seems to happen with webservers, but it happens with both lighttpd and Apache running in the jail. When this lag occurs it affects the entire system, slowing down ssh and everything. I don't doubt that the problem is somewhere else, as you say, but I'm going insane trying to find WHERE.


----------



## vivek (Jun 7, 2009)

Did you read my first post about httpd log file and connections?


----------



## wonslung (Jun 7, 2009)

Yes, I DID read that, but why would it work on the same system outside of the jail and not work inside, when it's set up the same way?

Hold on, I'll check my error log again.



```
[Sun Jun 07 11:37:22 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:23 2009] [warn] child process 7763 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:23 2009] [warn] child process 17599 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:23 2009] [warn] child process 852 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:23 2009] [warn] child process 74636 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 7763 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 17599 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 852 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 74636 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 7763 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 17599 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 852 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 74636 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:28 2009] [error] child process 7762 still did not exit, sending a SIGKILL
[Sun Jun 07 11:37:28 2009] [error] child process 7763 still did not exit, sending a SIGKILL
[Sun Jun 07 11:37:28 2009] [error] child process 17599 still did not exit, sendi
```

I've seen this before. I THINK that's the PHP bug where stopping Apache leaves processes hanging on to port 80 without exiting. I remember because every time I have to restart Apache I have to manually kill the processes. But if it's something else, I'm dying to know.


My main concern is that on the exact same system, running Apache or lighttpd in a jail causes serious lag after 2-3 minutes. I am mainly trying to find out if there is any obvious "you're a stupid newb" answer to my problem. It looks to be more complicated...
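
In case it helps anyone reading along, here is a rough sketch of tallying which children needed which signal, assuming the Apache warning format shown in the log excerpt above. The function name `summarize` is made up, and the sample lines are a shortened copy of that log.

```python
import re
from collections import defaultdict

# Matches e.g. "child process 7762 still did not exit, sending a SIGTERM"
PATTERN = re.compile(
    r"child process (\d+) still did not exit, sending a (SIG\w+)"
)


def summarize(log_text):
    """Map each child PID to a count of the signals Apache reported sending."""
    counts = defaultdict(lambda: defaultdict(int))
    for match in PATTERN.finditer(log_text):
        pid, signal = int(match.group(1)), match.group(2)
        counts[pid][signal] += 1
    # Convert nested defaultdicts to plain dicts for easy printing/comparison.
    return {pid: dict(signals) for pid, signals in counts.items()}


# Shortened copy of the log lines posted above (one PID only).
LOG = """\
[Sun Jun 07 11:37:22 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:24 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:26 2009] [warn] child process 7762 still did not exit, sending a SIGTERM
[Sun Jun 07 11:37:28 2009] [error] child process 7762 still did not exit, sending a SIGKILL
"""

if __name__ == "__main__":
    for pid, signals in summarize(LOG).items():
        print(pid, signals)
```

A child that only goes away after SIGKILL was stuck somewhere it could not be interrupted, which is at least a hint about where to look.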


----------



## anomie (Jun 11, 2009)

Please post all the host system's rc.conf directives related to your jail. I am curious to see the devfs ruleset you're using.


----------



## ahankinson (Nov 21, 2009)

I'm hoping someone will have found an answer to this. It's been bugging me too, and I can't seem to narrow down the problems.


```
user@machine: $ uname -a
FreeBSD machine 7.2-STABLE FreeBSD 7.2-STABLE #0: Thu Nov 12 18:15:13 EST 2009     root@machine:/usr/obj/usr/src/sys/GENERIC  amd64
```

Hardware: Dell PE1950, connected via Fibre Channel to XServe RAID array. 9G RAM, Quad-Core processor.

Jails are used primarily for isolating separate web server instances. A typical example will be MySQL, PHP5 and NginX with FastCGI.

The symptoms are the same as the OP describes: Web server runs fine for a few minutes, then gets unbearably slow. After a while it picks up again, runs fine for a few minutes, and then gets slow again.

Thing is, everything gets slow: Logging in to the jail via ezjail-admin console is slow, SSH response is slow. The host prompt is not slow, however, leading me to think it's something wrong with my jail setup.

Are there any diagnostics I can run? Any place to look for clues? I've done a lot of things that I think are dancing around the problem: upgraded to ZFS v13 (for better disk performance), increased the amount of memory available to PHP. 

Any help would be greatly appreciated.


----------



## wonslung (Nov 21, 2009)

I've since got everything working. I forget exactly what the problem was back then.

One thing I DID do differently was make a single jail for MySQL and use only one instance of MySQL. If you're running multiple instances of MySQL, that could be it.

Also, if you're using MySQL on ZFS you should read up on optimizing your ZFS filesystems for MySQL. Basically, use a separate ZFS filesystem per database, or at least one ZFS filesystem for ALL databases, so that you can match the ZFS recordsize to the database's block size.

One of the cool things about using a single ZFS filesystem for each db is that you can use snapshots.


----------



## ahankinson (Nov 22, 2009)

The nature of our environment means that I have to run MySQL in separate jails - we're a research lab, and sometimes we need to "reboot" one system with as little impact as possible on the other systems on the same machine. (We're in a bit of a constant netherworld between "production" and "development.") We were running a single unified MySQL system before, and had constant disruptions - something that I was hoping would be alleviated with jails.

Here's what I've verified/noticed:
1. It's not a firewall problem. The slowdowns occur when the host's firewall is completely down.
2. It's sporadic. Sometimes (like now) the system is perfectly fine and responsive. Other times, it's dog slow.
3. It's systemic. When it slows down, *everything* gets slower, from logging in via the ezjail-admin console to listing files in a directory to opening a man page. (Which is why I'm not sure that it's just MySQL...)
4. Other jails on the system exhibit these symptoms, but never the host system. It's always responsive.

We're only running about 7 or 8 jails, and I've checked CPU usage, disk throughput and memory swapping - everything is well under the limits.

The only thing I can think of is that I have some configuration in the networking / interfaces wrong.

Here's the output of "ifconfig -a" (.10 & .30 are the un-aliased IPs on each card; the rest are aliased):


```
bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
	ether 00:19:b9:e4:3d:00
	inet6 fe80::219:b9ff:fee4:3d00%bce0 prefixlen 64 scopeid 0x1 
	inet 123.45.15.28 netmask 0xffffffff broadcast 123.45.15.255
	inet 123.45.15.29 netmask 0xffffffff broadcast 123.45.15.255
	inet 123.45.15.31 netmask 0xffffffff broadcast 123.45.15.255
	inet 123.45.15.30 netmask 0xffffff00 broadcast 123.45.15.255
	media: Ethernet autoselect (1000baseTX <full-duplex>)
	status: active
bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
	ether 00:19:b9:e4:3c:fe
	inet6 fe80::219:b9ff:fee4:3cfe%bce1 prefixlen 64 scopeid 0x2 
	inet 123.45.15.10 netmask 0xffffff00 broadcast 123.45.15.255
	inet 123.45.15.25 netmask 0xffffffff broadcast 123.45.15.255
	inet 123.45.15.26 netmask 0xffffffff broadcast 123.45.15.255
	inet 123.45.15.27 netmask 0xffffffff broadcast 123.45.15.255
	media: Ethernet autoselect (1000baseTX <full-duplex>)
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 
	inet 127.0.0.1 netmask 0xff000000
```

The reason I think I might have my routing wrong is that sometimes, when it's really slow, NginX will time out with a "504 Gateway Time-out" error, which I suspect means that something is keeping it from communicating with the PHP FastCGI process (listening on 127.0.0.1:9000 within the jail).

I'm grasping at straws right now, since I'm not sure where to check next. Can any gurus out there suggest what I can do to pinpoint the slowdown?


----------



## vivek (Nov 22, 2009)

Are you using a firewall to route traffic between jails? The best I can recommend is to run tcpdump when the slowdown starts on your system. I've also noticed that IPs from the same range are assigned to different interfaces. Can you paste your rc.conf and firewall rules?


----------



## DutchDaemon (Nov 22, 2009)

I don't understand why your 0xffffffff netmasks have a .255 broadcast address. Have you set broadcast addresses manually? Your aliases should have broadcast == ip alias. And having overlapping networks on two interfaces is generally not advisable. So yes, we need more data, including a `$ netstat -rn`, a `$ route -n get 123.45.15.30` and a `$ route -n get 123.45.15.10`.
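
Both of those checks can be run mechanically against `ifconfig -a` output. Below is a minimal sketch, assuming the FreeBSD output format pasted earlier in this thread (hex netmasks, one `inet` line per address). The function names are made up, and the sample is a trimmed copy of that output. It flags /32 aliases whose broadcast differs from the address, and any network with a shorter prefix that shows up on more than one interface.

```python
import ipaddress
import re

# Matches IPv4 lines such as:
#   inet 123.45.15.28 netmask 0xffffffff broadcast 123.45.15.255
INET = re.compile(r"inet (\S+) netmask (0x[0-9a-f]{8}) broadcast (\S+)")


def parse_ifconfig(text):
    """Yield (interface, address, prefix length, broadcast) per IPv4 line."""
    iface = None
    for line in text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new interface, e.g. "bce0: flags=..."
            iface = line.split(":", 1)[0]
            continue
        match = INET.search(line)
        if match:
            prefix = bin(int(match.group(2), 16)).count("1")
            yield iface, match.group(1), prefix, match.group(3)


def check(entries):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    networks = {}
    for iface, addr, prefix, bcast in entries:
        if prefix == 32:
            if bcast != addr:
                problems.append(f"{iface}: {addr}/32 has broadcast {bcast}")
        else:
            net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
            networks.setdefault(net, set()).add(iface)
    for net, ifaces in networks.items():
        if len(ifaces) > 1:
            problems.append(f"{net} is configured on {', '.join(sorted(ifaces))}")
    return problems


# Trimmed copy of the ifconfig output posted above.
SAMPLE = """\
bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    inet 123.45.15.28 netmask 0xffffffff broadcast 123.45.15.255
    inet 123.45.15.30 netmask 0xffffff00 broadcast 123.45.15.255
bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    inet 123.45.15.10 netmask 0xffffff00 broadcast 123.45.15.255
    inet 123.45.15.25 netmask 0xffffffff broadcast 123.45.15.255
"""

if __name__ == "__main__":
    for problem in check(parse_ifconfig(SAMPLE)):
        print(problem)
```

This only looks at the interface configuration; it does not replace the `netstat -rn` and `route -n get` output, which show what the kernel actually does with the packets.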


----------



## ahankinson (Nov 27, 2009)

Thanks for the replies everyone.

It seems I did have the routing screwed up. I've modified my setup so that all IPs on a subnet are on a single network connection, and changed the broadcast addresses. Everything, including MySQL, seems to be running without a hitch now.
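
The final configuration isn't posted here, so purely as an illustration: the sketch below prints rc.conf lines for one possible layout where a single interface carries the subnet's primary address plus /32 aliases. The interface name and addresses are borrowed from the earlier ifconfig output, and the helper name `rc_conf_lines` is hypothetical. Treat it as a guess at the shape of the fix, not the actual settings used.

```python
def rc_conf_lines(iface, primary, netmask, aliases):
    """Build rc.conf entries: one primary address plus /32 aliases."""
    lines = [f'ifconfig_{iface}="inet {primary} netmask {netmask}"']
    for n, addr in enumerate(aliases):
        # Aliases in the same subnet as the primary get an all-ones netmask.
        lines.append(
            f'ifconfig_{iface}_alias{n}="inet {addr} netmask 255.255.255.255"'
        )
    return lines


if __name__ == "__main__":
    # Assumed example values, taken from the ifconfig output in this thread.
    for line in rc_conf_lines(
        "bce1",
        "123.45.15.10",
        "255.255.255.0",
        ["123.45.15.25", "123.45.15.26", "123.45.15.27"],
    ):
        print(line)
```

The alias numbers have to be consecutive starting from 0, which is why the sketch numbers them with `enumerate`.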


----------

