# Reboot when running sysctl -a



## Hornpipe2 (Jan 4, 2013)

Here's an odd one.

This started as an automatic reboot every 30 minutes.  It appeared something was kicking off at boot and 30mins later it would restart.  Or, ACPI would somehow kick in after 30mins to disable the screen or something, which then caused either a freeze or a straight reboot.

Started attempting to diagnose the problem and I got a weird one.  I can ssh into the machine, poke around with `$ ls`, run `$ top`, etc etc.  But running `$ sysctl -a` begins printing things to the console and then it just hard reboots!

I have a suspicion of HDD corruption: fsck(1) found problems with /var/run, though it now marks the fs as "clean".  Suggestions?  Or should I just begin the backup-and-reformat procedure?

ETA:  Tried to disable ACPI using the hint. tunable in loader.conf.  Still hangs / reboots.


----------



## chatwizrd (Jan 4, 2013)

You should look at /var/log/messages or dmesg for any errors. You also might want to enable core dumps in /etc/rc.conf. Not sure where the proper guide is...maybe http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
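
If the reboots turn out to be kernel panics, the crash-dump settings in /etc/rc.conf would look roughly like the sketch below. `dumpdev` and `dumpdir` are the standard rc.conf(5) variables; the values shown are common choices and are an assumption, not something confirmed for this machine. A hard reset without a panic would leave no dump behind either way.

```sh
# /etc/rc.conf fragment (assumed values, adjust to the local setup)
dumpdev="AUTO"        # let the system pick a swap device to hold the crash dump
dumpdir="/var/crash"  # where savecore(8) stores the dump on the next boot
```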


----------



## cpm@ (Jan 4, 2013)

Which version is installed? Have you modified the kernel on your system?

Please, show your system information:

`# uname -a`

Rebuilding world and kernel may resolve your problem. Try compiling a GENERIC kernel; if needed, check your custom config file to rule out errors.

Use the pastebin.com service to paste your problematic kernel config file.

Try removing "sysctl -a" from /etc/rc.d/initrandom to check whether the machine still reboots after a similar interval (as a test only).

`# less -p "sysctl -a" /etc/rc.d/initrandom`


----------



## Hornpipe2 (Jan 5, 2013)

Okay let's go through them, then.

[cmd="uname -a"]FreeBSD server.greg-kennedy.com 9.0-RELEASE-p3 FreeBSD 9.0-RELEASE-p3 #0: Tue Jun 12 01:47:53 UTC 2012     root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386[/cmd]

It's not a custom kernel or world, so rebuilding won't help.

Tried taking the sysctl -a call out of initrandom: still crashes after 30 mins.

/var/log/messages doesn't have anything useful.  There's no warning before these: just a flat-out reboot.  There is this in dmesg, but it's not that helpful, since the problem persists with ACPI disabled.

[cmd="dmesg"]
acpi0: <DELL 8250   > on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, f00000 (3) failed
acpi0: reservation of 1000000, 1ef6f000 (3) failed[/cmd]

Here's the minutes surrounding a reboot.
[cmd="cat /var/log/messages"]
...
Jan  4 19:37:44 server kernel: timecounter TSC must not be in use when changing frequencies; change denied
Jan  4 19:38:15 server last message repeated 121 times
Jan  4 19:39:44 server last message repeated 356 times
Jan  4 19:39:44 server rpc.statd: Failed to contact host macmini.greg-kennedy.com: RPC: Port mapper failure - RPC: Timed out
Jan  4 19:39:44 server kernel: timecounter TSC must not be in use when changing frequencies; change denied
Jan  4 19:40:54 server syslogd: kernel boot file is /boot/kernel/kernel
Jan  4 19:40:54 server kernel: Copyright (c) 1992-2012 The FreeBSD Project.
Jan  4 19:40:54 server kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan  4 19:40:54 server kernel: The Regents of the University of California. All rights reserved.
...
[/cmd]
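
To confirm the 30-minute pattern, it may help to list every boot recorded in the log. The sketch below assumes that the "kernel boot file is" line from syslogd appears once per startup (it is logged whenever syslogd starts, which normally means once per boot). `boot_times` is a hypothetical helper name, and the sample lines are copied from the excerpt above.

```sh
#!/bin/sh
# Print the timestamp of every syslogd startup found in the given log
# file(s), or in standard input when no file is named.
boot_times() {
    awk '/syslogd: kernel boot file is/ { print $1, $2, $3 }' "$@"
}

# Demo on three lines taken from the excerpt above.
sample=$(cat <<'EOF'
Jan  4 19:39:44 server last message repeated 356 times
Jan  4 19:40:54 server syslogd: kernel boot file is /boot/kernel/kernel
Jan  4 19:40:54 server kernel: Copyright (c) 1992-2012 The FreeBSD Project.
EOF
)
boots=$(printf '%s\n' "$sample" | boot_times)
echo "$boots"
```

On the real machine this would be run as `boot_times /var/log/messages`; boots spaced about 30 minutes apart would match the reported symptom.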

---

I'm really thinking this might be one of those "format and forget it" situations.  Could climb to 9.1 at the same time, so it's not a total waste of time.


----------



## cpm@ (Jan 5, 2013)

> Jan 4 19:37:44 server kernel: timecounter TSC must not be in use when changing frequencies; change denied



Show the timecounters available on your machine:

`# sysctl kern.timecounter.choice`

And show which one is in use right now:

`# sysctl kern.timecounter.hardware`

You should change kern.timecounter.hardware to i8254. On 9.0 it is a sysctl rather than a loader tunable, so to make the change permanent, add it to /etc/sysctl.conf:

`# echo "kern.timecounter.hardware=i8254" >> /etc/sysctl.conf`



> Jan 4 19:39:44 server rpc.statd: Failed to contact host macmini.greg-kennedy.com: RPC: Port mapper failure - RPC: Timed out



Check that you have added these to /etc/rc.conf:

```
rpc_statd_enable="YES"
rpc_lockd_enable="YES"
```
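
One way to verify those two entries is a quick textual check. This is only a sketch: `check_rc` is a hypothetical helper, it looks for a literal `name="YES"` line and ignores later overrides or /etc/rc.conf.local, and the demo runs against a temporary file rather than the real /etc/rc.conf.

```sh
#!/bin/sh
# Report whether each named variable is set to YES in an rc.conf-style file.
check_rc() {
    file=$1
    shift
    for var in "$@"; do
        if grep -Eiq "^${var}=\"?YES\"?" "$file"; then
            echo "$var: enabled"
        else
            echo "$var: not enabled"
        fi
    done
}

# Demo file holding the two lines above; pass /etc/rc.conf on a real system.
tmp=$(mktemp)
printf '%s\n' 'rpc_statd_enable="YES"' 'rpc_lockd_enable="YES"' > "$tmp"
result=$(check_rc "$tmp" rpc_statd_enable rpc_lockd_enable)
echo "$result"
rm -f "$tmp"
```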


----------



## Hornpipe2 (Jan 5, 2013)

[cmd=""]grkenn@server$ sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(800) i8254(0) ACPI-fast(900) dummy(-1000000)
grkenn@server$ uptime
 9:25PM  up  1:45, 1 user, load averages: 0.10, 0.03, 0.01
grkenn@server$ sysctl kern.timecounter.hardware
kern.timecounter.hardware: ACPI-fast[/cmd]
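
For reading the `kern.timecounter.choice` line: the number in parentheses is the quality the kernel assigns to each timecounter, and by default it selects the highest one, which matches ACPI-fast being in use here. A minimal sketch that ranks the candidates follows; `rank_timecounters` is a hypothetical helper, and the input is the line from the output above rather than a live query.

```sh
#!/bin/sh
# Line copied from the output above. On a live FreeBSD system it could
# come from: sysctl -n kern.timecounter.choice
choice='TSC(800) i8254(0) ACPI-fast(900) dummy(-1000000)'

# Print "quality name" pairs, one per line, highest quality first.
rank_timecounters() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        sed -n 's/^\(.*\)(\(-\{0,1\}[0-9][0-9]*\))$/\2 \1/p' |
        sort -rn
}

rank_timecounters "$choice"
best=$(rank_timecounters "$choice" | head -n 1 | cut -d' ' -f2)
echo "highest quality: $best"
```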

Server has been up for 1:45 now, after taking sysctl -a out of initrandom.  Is that function called after 30 mins of uptime?  And, why in the world would sysctl -a force a reboot?

I do have those entries in rc.conf btw.

ETA:  what the heck, now running sysctl -a at command line is working flawlessly.  WHAT IS GOING ON


----------



## Beastie (Jan 5, 2013)

Hornpipe2 said:

> I have a suspicion of HDD corruption: fsck(1) found problems with /var/run, though it now marks the fs as "clean".  Suggestions?  Or should I just begin the backup-and-reformat procedure?

Hornpipe2 said:

> ETA:  what the heck, now running sysctl -a at command line is working flawlessly.  WHAT IS GOING ON


I'm sorry if I misunderstood you, but you shouldn't wait for situations like this to make backups. If you haven't already, make backups ASAP.

Then check the disk with sysutils/smartmontools and test the memory for *at least a few hours* with sysutils/memtest86+.


----------



## wblock@ (Jan 5, 2013)

We have not heard anything about the actual hardware, although the ACPI message says "DELL 8250".  If this is a machine that is several years old, inspect the motherboard for visibly bad capacitors, bulging or leaking from the top or bottom.  Many older machines have that problem, and the symptoms can be intermittent errors.  The capacitors close to the processor are particularly important and failure-prone.  This can also be a problem with the capacitors inside power supplies.  Those are harder to see, but swapping power supplies is a good test.  Although of course Dell sometimes used proprietary pinouts on their power supplies for no good reason except to force customers to buy Dell power supplies...

Summary: when you have intermittent errors, suspect hardware.  As Beastie mentions, hard drives and memory are suspects, and power problems like those caused by failed capacitors are in the same league.


----------



## Hornpipe2 (Jan 5, 2013)

Reformatted and reinstalled 9.0-RELEASE (it's what I had on CD), then freebsd-update to 9.1-RELEASE.

Seems stable for now.  I'm not ruling out hardware issues.  If this happens again, I'll look into it.

BTW: this seemed like it might be relevant.  http://blog.e-shell.org/266


----------



## cpm@ (Jan 5, 2013)

Hornpipe2 said:

> Reformatted and reinstalled 9.0-RELEASE (it's what I had on CD), then freebsd-update to 9.1-RELEASE.
> 
> Seems stable for now.  I'm not ruling out hardware issues.  If this happens again, I'll look into it.
> 
> BTW: this seemed like it might be relevant.  http://blog.e-shell.org/266



If you are curious about the system configuration options defined in /etc/rc.conf, read rc.conf(5).

You should make frequent backups; it saves time and spares you frustration in similar situations in the future.

Enjoy FreeBSD and its great community!

Regards.


----------

