# too many spontaneous reboots



## cbrace (Mar 13, 2011)

Hi all,

A few months ago I built a new box to function as a home/office mail/web/file server and LAN/ADSL gateway to replace the previous system, a nearly ten-year-old Pentium 4 box. The new system has an AMD Athlon II X2 250 CPU and I installed FreeBSD 8.1 x86_64 on it.

For some reason, this new server tends to spontaneously reboot fairly frequently. I don't care about long uptimes, but it tends to reboot at inconvenient moments, such as during a recent attempt to use system-upgrade to bring the system up to v8.2, which left the upgrade in a partially completed state. 

I have just run memtest+ for more than 24 hours (48 passes) and no errors turned up, so I don't think it is a RAM problem.

Any suggestions how I might try to determine what is going on here? There is no indication at all in the system log.

FWIW, the old Pentium 4 server, although far slower and less energy efficient, rebooted itself only a couple of times during the three or so years I used it.

Thanks.


----------



## wblock@ (Mar 13, 2011)

A FreeBSD system should reboot only on command.  Please describe the hardware: motherboard, disk drives, power supply, video board.  Do other operating systems run reliably on it?

How often is "fairly frequently"?  Does it reboot when it's busy or idle?  At a certain time of day?  What happens--does it show a panic?

What software is installed on FreeBSD?  xscreensaver has some OpenGL modules that can be a problem.
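
On the panic question: if the kernel is panicking, a crash dump would confirm it. A minimal sketch of the relevant `/etc/rc.conf` settings, assuming there is a swap partition with enough room to hold a dump (the values shown are only a suggestion and may already be the defaults on your install):

```shell
# /etc/rc.conf
dumpdev="AUTO"        # on panic, write a kernel crash dump to the swap device
dumpdir="/var/crash"  # where savecore(8) stores the dump at the next boot
```

If the box reboots and `/var/crash` stays empty, the kernel probably never got the chance to panic, which would point toward hardware (power, heat) rather than software.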


----------



## jnbek (Mar 13, 2011)

I had the same problem a few years back with an old Classic Pentium. Every time I would compile something, the system would just reboot. Turns out the CPU fan had stopped working and the reboots were due to the CPU overheating and the system rebooting to protect itself. I replaced the fan and the reboots went away.


----------



## cbrace (Mar 13, 2011)

wblock said:

> A FreeBSD system should reboot only on command.  Please describe the hardware: motherboard, disk drives, power supply, video board.


Motherboard is a Gigabyte M68M-S2P

HD 1 is a Western Digital Caviar 320 GB
HD 2 is a Samsung 2 TB

Power supply is a generic 450 watt.

Onboard video is nVidia GeForce 7025.



> Do other operating systems run reliably on it?


I have only tried FreeBSD.



> How often is "fairly frequently"?  Does it reboot when it's busy or idle?  At a certain time of day?  What happens--does it show a panic?


Maybe once every week or two at random times. It appears to happen when the system is under a certain load. No error message at all.



> What software is installed on FreeBSD?  xscreensaver has some OpenGL modules that can be a problem.


I am not running X. Just the usual suspects: lighttpd, postfix, dovecot, that kind of thing.


----------



## cbrace (Mar 13, 2011)

jnbek said:

> I had the same problem a few years back with an old Classic Pentium. Every time I would compile something, the system would just reboot. Turns out the CPU fan had stopped working and the reboots were due to the CPU overheating and the system rebooting to protect itself. I replaced the fan and the reboots went away.


This is a possibility which I hadn't thought of, thanks. I am running powerd and most of the time it steps the CPU down to 200-300 MHz; only when I am compiling, etc., does it crank up to the maximum 3 GHz. Perhaps it is overheating at these moments.

However, I am having trouble getting accurate temperature measurements for this CPU. Alas, the sysctl way doesn't work:

```
$ sysctl hw.acpi.thermal.tz0.temperature
sysctl: unknown oid 'hw.acpi.thermal.tz0.temperature'
```
Following a couple of other threads I found here (determine CPU temperature, CPU Temperature correct?), I have installed both mbmon and lmmon, but they both return what are certainly spurious values:

```
$ mbmon   

Temp.= 211.0,  0.0,  0.0; Rot.=  869, 21093, 23275
Vcore = 1.04, 1.89; Volt. = 3.38, 6.85, 11.61, -14.19, -6.12
```


```
$ lmmon -i

 Motherboard Temp               Voltages

 211C / 411F / 484K        Vcore1:   +1.016V
                           Vcore2:   +1.844V
    Fan Speeds             + 3.3V:   +3.297V
                           + 5.0V:   +6.654V
    1:  869 rpm            +12.0V:  +11.938V
    2: 21093rpm            -12.0V:  -15.938V
    3: 23275rpm            - 5.0V:   -6.654V
```
211C is of course impossible! The max is something like 70C, no? I just stuck my hand in the case and the heat sink on the CPU is barely warm.

If anyone has any ideas how to get an accurate temperature reading from this dual-core Athlon I'd be most grateful.


----------



## wblock@ (Mar 13, 2011)

The 5V supply being shown as nearly 7V is more concerning, since 3.3V and 12V seem right.  I'd check that with a meter or with a utility under another OS.  Or maybe HDT can show it.
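
As a rough sanity check, the positive rails reported by lmmon can be compared against their nominal values. A minimal sketch, assuming the usual ATX tolerance of about ±5% on +3.3V, +5V and +12V (`check_rail` is just an illustrative helper name, and the readings are the ones posted above):

```shell
#!/bin/sh
# Sketch: flag a positive supply rail whose reading is outside +/-5% of nominal.
# The 5% tolerance is an assumption based on the usual ATX figures.
check_rail() {
    # $1 = nominal voltage, $2 = measured voltage
    awk -v nom="$1" -v got="$2" 'BEGIN {
        lo = nom * 0.95; hi = nom * 1.05
        if (got < lo || got > hi)
            printf "%.1fV rail: %.3fV is OUT of range (%.3f-%.3f)\n", nom, got, lo, hi
        else
            printf "%.1fV rail: %.3fV is ok (%.3f-%.3f)\n", nom, got, lo, hi
    }'
}

# Readings taken from the lmmon output earlier in the thread.
check_rail 3.3  3.297
check_rail 5.0  6.654
check_rail 12.0 11.938
```

Only the 5V reading falls outside the window. That could still be the sensor chip being read with the wrong scaling rather than the supply itself, which is why a meter is the better judge.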


----------



## jem (Mar 13, 2011)

Old power supplies can deteriorate and deliver less power than required.  This may well result in spontaneous reboots.


----------



## disi (Mar 15, 2011)

I had the exact same problem. First I ran Linux (three different kernel releases) on the hardware (Intel Atom); it showed ~20C and occasionally rebooted. Then I ran FreeBSD 8.2, which shows ~30C (coretemp.ko) and still reboots itself.

Turns out, the BIOS shows ~40-50C and was set to shut down at 70C. The CPU is fanless, so this could really happen. However, the machine rarely has much to do, so I disabled the automatic shutdown in the BIOS and have had no problems since.

Do not trust those kernel drivers to show the correct temperature.


----------



## jnbek (Mar 16, 2011)

I am not sure that turning off the auto-reboot setting is a wise idea. Seems to me it'd be better to just install a fan, especially since they're so inexpensive in comparison to say... a new CPU.


----------



## sch (Mar 25, 2011)

I had a similar problem.
Turn off *powerd* and check if the issue goes away.
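
A minimal sketch, assuming powerd is started from `/etc/rc.conf`: stop the running daemon as root with `/etc/rc.d/powerd stop`, then keep it off across reboots with:

```shell
# /etc/rc.conf
powerd_enable="NO"   # do not start powerd at boot
```

With powerd stopped the CPU should stay at a single frequency, which `sysctl dev.cpu.0.freq` should show if the cpufreq driver is attached.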


----------



## cbrace (Apr 17, 2011)

sch said:

> Turn off *powerd* and check if the issue goes away.


I took your advice and haven't had any spontaneous reboots since then (current uptime is ca. 35 days).

It looks like the problem may well indeed be *powerd*.

Does anyone have experience with sysutils/cpupowerd?

From the web site:


> A daemon which controls the frequency and voltage of CPUs.
> 
> This userland program adjusts the frequency and voltage according to the CPUs load.
> Its capabilities include overvolting as well as undervolting.
> Currently it supports only AMD K8 processors like Athlon, Athlon64 (X2), Sempron, Opteron, Turion ...



This system has an AMD Athlon II X2, so that last restriction isn't a problem...


----------



## cbrace (Apr 17, 2011)

I thought I'd give cpupowerd a try.

After 'make install' the following is displayed:

```
To generate a safe dafault config for you cpu you can use the "-a" or
"--autoconfig" switch.

cpupowerd -a /usr/local/etc/cpupowerd.conf
```


```
# cpupowerd -a /usr/local/etc/cpupowerd.conf 
cpupowerd 0.2.1 written by Markus Strobl.
WARNING: This program could cause damage to your Hardware!
Opening cpuid file /dev/cpuctl0 for reading failed!
Initialisation failed!
```

The README, which obviously I should have read first, says:



> FreeBSD:
> The devcpu kernel modul from ports/sysutils must be installed
> before(!) you begin to compile cpupowerd.



OK, never too late:


```
/usr/ports/sysutils/devcpu]# make all install clean 
===>  devcpu-0.8.3 already included into base system.
*** Error code 1

Stop in /usr/ports/sysutils/devcpu.
```

Huh? Anyone know what is going on here?

Thanks


----------



## Zare (Apr 18, 2011)

`# kldload cpuctl`

(stupid three char limit...)
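
To get the module back after a reboot, the usual approach is a line in `/boot/loader.conf`. A sketch, on the assumption that the `/dev/cpuctl0` node cpupowerd complained about appears once the module is loaded:

```shell
# /boot/loader.conf
cpuctl_load="YES"   # load cpuctl(4) at boot, same effect as kldload cpuctl
```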


----------



## cbrace (Apr 21, 2011)

OK, the kernel module is now loaded.

Alas, running *cpupowerd* generates an error message:

```
Initialisation failed!
```
I'll write to the developer; if I hear anything back from him, I will post it here.


----------



## cbrace (Jun 20, 2011)

Hi all,

In response to my original posting in March, several of you made some useful suggestions. Switching off powerd seems to have helped.

However, several days ago I was re-compiling mysql55-server and a segmentation fault caused the system to crash and reboot.

This is not the first time I have seen a segmentation fault, and I am wondering what the cause is. Can anyone suggest a way of determining what the problem here may be?

I don't have a fetish about long uptimes; it is just that I'd rather reboot the system myself.

Thanks.


----------



## wblock@ (Jun 20, 2011)

Swap the power supply, or test with only one DIMM at a time.  Don't do both at the same time.


----------

