# fsck segfaults



## wally_360 (Oct 17, 2013)

Our fileserver panicked tonight. After it came back up, it couldn't background `fsck` one of the file systems, so I got it running again by commenting that filesystem out of /etc/fstab.

Now I am trying to do an `fsck` on the filesystem, but it segfaults.


```
root@mercury:/var/log # fsck /dev/da0p1
fsck: Could not determine filesystem type

root@mercury:/var/log # fsck -t ufs /dev/da0p1
** /dev/da0p1
fsck: /dev/da0p1: Segmentation fault: 11
```

It's running 9.2-RELEASE.


```
uname -a 
FreeBSD mercury 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898: Thu Sep 26 22:50:31 UTC 2013     root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
```

The filesystem is on an LSI RAID card, a 9750-24i4e. The controller says everything is fine.

I tried looking at the core file, but it doesn't mean anything to me.


```
# gdb core fsck_ufs.core 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...core: No such file or directory.

Core was generated by `fsck_ufs'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000405cce in ?? ()
(gdb) bt
#0  0x0000000000405cce in ?? ()
#1  0xffffffff90000000 in ?? ()
#2  0x000000000040ef6b in ?? ()
#3  0x0000000000624e10 in ?? ()
#4  0x000000080061f8cb in ?? ()
#5  0x0000007b7100ff00 in ?? ()
#6  0x00000000000121a0 in ?? ()
#7  0x0000007b00000005 in ?? ()
#8  0x00000000525f4c3b in ?? ()
#9  0x00000000021036a8 in ?? ()
#10 0x00000000525f4c3b in ?? ()
#11 0x00000000021036a8 in ?? ()
#12 0x00000000525f4c3b in ?? ()
#13 0x00000000021036a8 in ?? ()
#14 0x0000000000000000 in ?? ()
(gdb) quit
```
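A note on the session above: `gdb` expects the executable first and the core file second, so `gdb core fsck_ufs.core` went looking for a program named `core` (hence "core: No such file or directory") and had no binary to resolve the frames against. A minimal sketch of a helper that passes the arguments in the expected order follows. The function name `core_backtrace` and the temporary file name are made up for illustration, and the frames may still show up as `??` if the binary is stripped.

```sh
# Hypothetical helper: print a backtrace from a core file in batch mode.
# gdb wants the executable first and the core file second.
core_backtrace() {
    exe=$1
    core=$2
    if [ ! -f "$exe" ] || [ ! -f "$core" ]; then
        echo "usage: core_backtrace <executable> <core-file>" >&2
        return 1
    fi
    # Put the gdb commands in a small script file so -x can read them.
    cmds=$(mktemp /tmp/gdbcmds.XXXXXX) || return 1
    printf 'bt\n' > "$cmds"
    gdb -batch -x "$cmds" "$exe" "$core"
    status=$?
    rm -f "$cmds"
    return "$status"
}

# Example (not run here): core_backtrace /sbin/fsck_ufs fsck_ufs.core
```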


----------



## kpa (Oct 17, 2013)

The partition da0p1 is most likely the freebsd-boot partition, which is not a UFS filesystem at all. The da0p2 partition is probably your UFS filesystem.


----------



## wally_360 (Oct 17, 2013)

It is not the boot partition; that's on another RAID set (/dev/raid/r0p2).

This is a large RAID 6 array that is mounted at /sam.


```
more /etc/fstab
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/raid/r0p2  /               ufs     rw      1       1
/dev/raid/r0p3  none            swap    sw      0       0
#/dev/da0p1     /sam            ufs     rw      2       2  << problematic right now
```


----------



## jb_fvwm2 (Oct 17, 2013)

`fsck_ffs -y`; but have you tried single user mode?


----------



## wally_360 (Oct 17, 2013)

I just tried single-user mode and got the same thing.

I tried booting the 9.2-RELEASE Boot-only CD, and ran it in LiveCD mode. It still segfaulted.

I found a reference to softupdates causing `fsck_ufs` to segfault, so I disabled softupdates journaling.


```
# tunefs -j disable /dev/da0p1
Clearing journal flags from inode 4
tunefs: soft updates journaling cleared but soft updates still set.
tunefs: remove .sujournal to reclaim space
```

But it still segfaults.

```
root@mercury:/usr/home/shawn # fsck_ufs /dev/da0p1
** /dev/da0p1
Segmentation fault (core dumped)
```


----------



## zspider (Oct 17, 2013)

wally_360 said:

> I found a reference to softupdates causing `fsck_ufs` to segfault, so I disabled softupdates journaling.
> 
> `# tunefs -j disable /dev/da0p1`
> `Clearing journal flags from inode 4`
> ...



That should only be an issue if you're running CURRENT, which isn't suitable for a production environment. 9.2-RELEASE/RELENG does not have that issue as far as I can tell.

The `fsck` failing could be a sign of a more serious issue with your install, though without more information there is little I can do.


----------



## wally_360 (Oct 17, 2013)

zspider said:

> That should only be an issue if you're running CURRENT, which isn't suitable for a production environment. 9.2-RELEASE/RELENG does not have that issue as far as I can tell.



Yeah, I am just getting desperate. This is a production server and being down when everyone comes in tomorrow will be very very bad.

Any suggestions?



			
zspider said:

> The `fsck` failing could be a sign of a more serious issue with your install, though without more information there is little I can do.



What additional information can I get for you?


----------



## zspider (Oct 17, 2013)

wally_360 said:

> Yeah, I am just getting desperate. This is a production server and being down when everyone comes in tomorrow will be very very bad.
> 
> Any suggestions?



Well, you could try `ldd /sbin/fsck_ufs`. This shows all the libraries `fsck_ufs` depends on; if any are missing, it will say so.
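For a binary with a long dependency list, the check can be narrowed to just the unresolved entries. This is a rough sketch, assuming `ldd` marks a library it cannot resolve with the text "not found"; the function name `report_missing_libs` is made up for illustration.

```sh
# Hypothetical filter: reads `ldd` output on stdin and prints only the
# libraries the runtime linker could not resolve.
# Assumption: unresolved entries contain the text "not found".
report_missing_libs() {
    grep 'not found' || echo "all libraries resolved"
}

# Example (not run here): ldd /sbin/fsck_ufs | report_missing_libs
```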


----------



## wally_360 (Oct 17, 2013)

zspider said:

> Well you could try this,
> 
> `ldd /sbin/fsck_ufs`
> 
> This will show all the libraries `fsck_ufs` is depending on, if there are missing libraries it will say so.




```
root@mercury:/mnt # ldd /sbin/fsck_ufs
/sbin/fsck_ufs:
	libufs.so.6 => /lib/libufs.so.6 (0x800834000)
	libc.so.7 => /lib/libc.so.7 (0x800a38000)
```


----------



## zspider (Oct 17, 2013)

wally_360 said:

> `root@mercury:/mnt # ldd /sbin/fsck_ufs
> /sbin/fsck_ufs:
> libufs.so.6 => /lib/libufs.so.6 (0x800834000)
> libc.so.7 => /lib/libc.so.7 (0x800a38000)`



OK, try checking your PATH variable: `echo $PATH`. On my testing VM I have:

```
/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin
```

Preferably in that order; otherwise the shell is going to run whichever `fsck_ufs` it finds first.
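To see which copy actually wins, `command -v fsck_ufs` prints the one the shell would run. The sketch below goes a step further and lists every executable of that name along `PATH` in lookup order, so a stray copy in a later directory is visible too. The function name `find_in_path` is made up for illustration.

```sh
# Hypothetical helper: list every executable named "$1" along PATH,
# in lookup order. The first line printed is the one the shell would run.
find_in_path() {
    name=$1
    rest=$PATH:
    while [ -n "$rest" ]; do
        dir=${rest%%:*}      # first remaining PATH entry
        rest=${rest#*:}      # drop that entry and its colon
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

# Example (not run here): find_in_path fsck_ufs
```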


----------



## wally_360 (Oct 17, 2013)

```
root@mercury:~ # echo $PATH
/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin
```

I grabbed the fsck_ufs binary off of a 9.0-RELEASE box and that isn't segfaulting. Not fixing things right now.

I can also mount the filesystem read-only, just fine.


----------



## zspider (Oct 17, 2013)

wally_360 said:

> `root@mercury:~ # echo $PATH
> /sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin`
> 
> I grabbed the fsck_ufs binary off of a 9.0-RELEASE box and that isn't segfaulting. Not fixing things right now.
> ...



Maybe the binary just got bitrotted or clobbered. It's unlikely, but it's not impossible either.


----------



## wally_360 (Oct 17, 2013)

zspider said:

> Maybe the binary just got bitrot/clobbered, it's unlikely but it's not impossible either.



I thought that too, but then I booted off the LiveCD and ran it, same thing.


----------



## zspider (Oct 17, 2013)

wally_360 said:

> I thought that too, but then I booted off the LiveCD and ran it, same thing.



That is strange. You should try a 9.1 binary too, if you can.


----------



## Zare (Oct 17, 2013)

Please run `truss fsck /dev/da0p1` and paste the last lines of output here.
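One caveat: `fsck` is a front end that hands the real work to a separate `fsck_ufs` process, so a trace of the wrapper alone can end before the crash is reached. A rough sketch of tracing `fsck_ufs` directly and following any children is below. The function name `trace_fsck_ufs` and the default log path are made up for illustration, and `-n` is used so that `fsck_ufs` answers "no" to every prompt and does not modify anything.

```sh
# Hypothetical helper: trace fsck_ufs directly, following child
# processes, and show the last lines of the trace.
trace_fsck_ufs() {
    dev=$1
    log=${2:-/tmp/fsck_ufs.truss}
    # truss: -f follows forked children, -o writes the trace to a file.
    # fsck_ufs: -n assumes "no" for every question (read-only check).
    truss -f -o "$log" fsck_ufs -n "$dev"
    tail -n 20 "$log"
}

# Example (not run here): trace_fsck_ufs /dev/da0p1
```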


----------



## jb_fvwm2 (Oct 17, 2013)

One other possibility: I vaguely recall such a problem when the entries in lost+found were too numerous before the `fsck`, and I recall removing them with one of the more seldom-used disk data tools, the name of which I've forgotten.


----------



## wally_360 (Oct 17, 2013)

```
write(2,"\n",1)					 = 1 (0x1)
sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,0x0)		 = 0 (0x0)
sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,0x0)		 = 0 (0x0)
sigprocmask(SIG_BLOCK,SIGHUP|SIGINT|SIGQUIT|SIGKILL|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,0x0) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,0x0)		 = 0 (0x0)
process exit, rval = 1
```

I ran the `fsck_ufs` binary from a 9.0-RELEASE machine, and it was able to repair the file system.

Even on the now-clean filesystem, the 9.2-RELEASE `fsck_ufs` binary still segfaults.



			
jb_fvwm2 said:

> One other possibility, I vaguely recall such a problem if the entries in lost+found were too numerous prior to the `fsck`, and recall removing them with one of the more seldom-used disk data tools, the name of which I've forgotten.



When I ran the `fsck_ufs` from 9.0, it had to create the lost+found directory.


```
NO lost+found DIRECTORY
CREATE? [yn] y
```


----------

