# Can't work out which disk to boot from



## pathiaki (Nov 27, 2012)

Hi,

I've tried searching the forums and search engines.  I can't find out what is wrong with 9.1-RC3.

I did the usual and used svn to get the latest source yesterday.

I did the usual (GENERIC has all the debugging and GRAID commented out - Why is that even in there?)


```
make buildworld
make buildkernel KERNCONF=GENERIC  
make installkernel KERNCONF=GENERIC
```

reboot into single user.....


```
zfs set readonly=off <my pool>

make installworld
```

(Everything seemed ok)

I reboot.

Now it goes through the BIOS twice (from the look of it) and it hits the loader and says:


```
"Can't work out which disk we are booting from"
Guessed BIOS drive 0xffffffff not found by probes.  defaulting to disk0:
fal not found
```

and then it panics.

Anyone have a clue about this?  I have found numerous posts from previous years but they don't seem applicable or the solutions don't seem available.

This is on:

ASUS 990FX
16 GB Corsair
AMD 1090T
ZFS Mirror Boot (one on the internal SATA one on the MPI LSI SAS controller)
6 internal SATAs (one of the internals is part of the boot mirror)
12 drives on the MPI controller

I was attempting to go from 9.0 RELEASE to 9.1-RC3 on AMD64

Thank you,

Paul

PS - Sorry if I missed this in my searching of the forums, I'm kind of new here.


----------



## Beeblebrox (Nov 27, 2012)

As far as I remember, btx does not do well when it comes to booting off multiple HDDs.
Are you able to boot into any OS and show the layout of both HDDs? In FreeBSD it would be as below; if you have not assigned labels, the "-l" is unnecessary.
`# gpart show -l`
There are multiple solutions so that is why I am asking what it is exactly you want to do.


----------



## pathiaki (Nov 27, 2012)

Hmmm....  I'll be happy to perform that, however, this was working correctly prior to my upgrade.  I had 9.0-RELEASE in this configuration and everything was working just fine.  I could boot from either of the mirrored pair using the BIOS boot selector and everything would initialize and work just fine.

I'm leaning more towards the idea that, for once, I missed something in the UPDATING notes.  However, I'm also curious whether gptzfsboot needed to be upgraded, which I didn't do this time around.
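
On the gptzfsboot question: installworld refreshes `/boot/gptzfsboot` on the filesystem, but the copy stored in the boot partition is only rewritten by a separate `gpart bootcode` step. A minimal sketch of that step follows, assuming the boot partition is index 1 on each mirror disk and that the disks are named `ada0` and `ada1` (both are assumptions; confirm with `gpart show` first). The `run` helper and the `DRY_RUN` variable are made up for this example, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print (or run) the gpart commands that rewrite the GPT ZFS boot blocks.
# DRY_RUN defaults to 1 so nothing is written unless explicitly requested.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Assumption: index 1 is the freebsd-boot partition on every disk passed in.
update_bootcode() {
    for disk in "$@"; do
        run gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 "$disk"
    done
}

update_bootcode ada0 ada1
```

Doing both halves of the mirror keeps either disk bootable from the BIOS boot selector.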


----------



## pathiaki (Nov 28, 2012)

Hi Beeblebrox,

my *gpart show -l* is:


```
> gpart show -l
=>       34  488397101  ada0  GPT  (232G)
         34        128     1  (null)  (64k)
        162    8388608     2  (null)  (4.0G)
    8388770  480008365     4  (null)  (228G)

=>       34  488397101  ada1  GPT  (232G)
         34        128     1  (null)  (64k)
        162    8388608     2  (null)  (4.0G)
    8388770  480008365     4  (null)  (228G)

=>       34  250069613  ada2  GPT  (119G)
         34        128     1  (null)  (64k)
        162  249560960     2  (null)  (119G)
  249561122     508525        - free -  (248M)
```


The first two are my mirrored root disks.  The third is the 'emergency' disk I keep around.

I couldn't find anything amiss in my present configuration.  I assumed that it was either the loader.conf or something else that was screwing things up.

So....  I went extreme.  I mounted everything on the root drives with the following:


```
kldload zfs
zpool import -f
zfs set mountpoint=/mnt <pool>
zfs set mountpoint=/mnt/usr <pool>/usr
zfs set mountpoint=/mnt/var <pool>/var
```

(The reason for the above is that I have a large hierarchy of zfs mountpoints, however, the usual /, /usr, /var are where the OS lives.)

Once this was all there:


```
cd /mnt/usr/src/sys/amd64/conf
cp GENERIC* /mnt/root              # back up the kernel configs first
cd /mnt/usr/src
rm -rf *
svn checkout svn://svn.FreeBSD.org/base/stable/9 /mnt/usr/src
setenv DESTDIR /mnt
cd /mnt/usr/src/sys/amd64/conf
vi GENERIC                         # get rid of the annoying, "blow everything up on every board RAID" GEOM_RAID
cd /mnt/usr/src
make buildworld
make buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
make installworld                  # we can do this since we are not running on the OS in these subdirs right now
cd /mnt/boot
mv loader.conf loader.conf.<date>
cp loader.conf.<date> loader.conf
vi loader.conf                     # strip out the world
zfs set mountpoint=/ <pool>
zfs set mountpoint=/var <pool>/var
zfs set mountpoint=/usr <pool>/usr
reboot
```

BINGO!!!!  Everything is back.


```
uname -a

FreeBSD atlantisservices.net 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #0 r243641M: Wed Nov 28 09:48:43 EST 2012
     root@atlantisservices.net:/usr/obj/mnt/usr/src/sys/GENERIC  amd64
```
We are on r243641M.  *whew*

(OK, I went to the config and backed up all my "GENERIC" kernels just in case.  I then went to the relative /usr/src and blew away the world.  I re-downloaded the src via subversion.  I rebuilt the world and kernel.  Moved my current loader.conf out of the way to make sure nothing was screwing up the loader.)

So....  there are 3 possibilities:  1) Something is amiss in the loader.conf - I'll check and see if I can get this to repeat by putting the original back in place.  2) There was something wrong with the previous "stable" version - I doubt it.  3) Something is amiss when we load the version over the existing 9.0-RELEASE.  (I am really leaning that way.  I had this same thing happen when I loaded RC2 onto this box - I mean the EXACT SAME THING.  The same loader issue to a 'T'.)  Heavy possibility of 1) and 3).  I'll write soon.  If it's 3), there should be an entry put in UPDATING about this.

Paul


----------



## pathiaki (Nov 28, 2012)

Hi,

It looks like it wasn't the loader.conf.  I put the old one back in place and the thing booted fine.

So, here's the history:

I attempted an upgrade to 9.1-RC2 on an amd64 machine from 9.0-RELEASE when 9.1-RC2 first came out.  I used the usual cvsup method at that time.  It blew up in the EXACT same manner as this incident.

I re-installed 9.0-RELEASE and everything was fine again.

Two days ago, I installed 9.1-RC3 on the "back at 9.0-RELEASE" machine and got the same issue: what seemed to be the loader not being able to figure out which ZFS mirrored root pool devices to boot from.

There's only one thing left to try.  I will rebuild the kernel with the version of 'GENERIC' that I use.  If that works, then this seems to point to a possible problem with the upgrade of 9.0-RELEASE to 9.1.

Paul


----------



## Beeblebrox (Nov 28, 2012)

Are you doing the build (world + kernel) natively, or are you importing the zpool and installing world with DESTDIR?
If you boot into a FreeBSD environment from other media, import the zpool's root, and export it when you are done, the zpool will become "not ready" and the system cannot boot. If you are doing something like this, after you are finished and you have exported the zpool, you must re-import the pool without the altroot flag. That's right, smack on top of the existing root of the alternative media - then you shut down.


----------



## pathiaki (Nov 28, 2012)

Hi,

I imported the pool and did a zfs set mountpoint=/mnt for my original mirrored root.  I then mounted my pool's /usr and /var under the /mnt.

It looked like:


```
/mnt
/mnt/usr
/mnt/var
```

After my svn, I set DESTDIR as /mnt.

I did my make world and make kernel and it installed in the /mnt hierarchy.

Now, onto my latest troubleshooting....  I did my minimalist GENERIC kernel.  With this configuration:


```
cpu             HAMMER
ident           GENERIC

options         SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
options         SCTP                    # Stream Control Transmission Protocol
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         MD_ROOT                 # MD is a potential root device
options         MSDOSFS                 # MSDOS Filesystem
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         STACK                   # stack(9) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
options         AUDIT                   # Security event auditing
options         MAC                     # TrustedBSD MAC Framework

# Make an SMP-capable kernel by default
options         SMP                     # Symmetric MultiProcessor Kernel

# CPU frequency control
device          cpufreq

# Bus support.
device          acpi
device          pci

# ATA controllers
device          ahci            # AHCI-compatible SATA controllers
device          ata             # Legacy ATA/SATA controllers
options         ATA_CAM         # Handle legacy controllers with CAM
options         ATA_STATIC_ID   # Static device numbering

# ATA/SCSI peripherals
device          scbus           # SCSI bus (required for ATA/SCSI)
device          da              # Direct Access (disks)
device          cd              # CD
device          pass            # Passthrough device (direct ATA/SCSI access)
device          ses             # Enclosure Services (SES and SAF-TE)
device          ctl             # CAM Target Layer

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc          # AT keyboard controller
device          atkbd           # AT keyboard
device          psm             # PS/2 mouse

device          kbdmux          # keyboard multiplexer

device          vga             # VGA video card driver
options         VESA            # Add support for VESA BIOS Extensions (VBE)

device          splash          # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device          sc
options         SC_PIXEL_MODE   # add support for the raster text mode

# device                agp             # support several AGP chipsets

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device          cbb             # cardbus (yenta) bridge
device          cardbus         # CardBus (32-bit) bus

# Serial (COM) ports
device          uart            # Generic UART driver

# Parallel port
device          ppc
device          ppbus           # Parallel port bus (required)


# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device          miibus          # MII bus support

# Wireless NIC cards
device          wlan            # 802.11 support
options         IEEE80211_AMPDU_AGE # age frames in AMPDU reorder q's
options         IEEE80211_SUPPORT_MESH  # enable 802.11s draft support
device          wlan_wep        # 802.11 WEP support
device          wlan_ccmp       # 802.11 CCMP support
device          wlan_tkip       # 802.11 TKIP support
device          wlan_amrr       # AMRR transmit rate control algorithm
device          an              # Aironet 4500/4800 802.11 wireless NICs.
device          ath             # Atheros NICs
device          ath_pci         # Atheros pci/cardbus glue
device          ath_hal         # pci/cardbus chip support
options         AH_SUPPORT_AR5416       # enable AR5416 tx/rx descriptors
device          ath_rate_sample # SampleRate tx rate control for ath

# Pseudo devices.
device          loop            # Network loopback
device          random          # Entropy device
options         PADLOCK_RNG     # VIA Padlock RNG
options         RDRAND_RNG      # Intel Bull Mountain RNG
device          ether           # Ethernet support
device          vlan            # 802.1Q VLAN support
device          tun             # Packet tunnel.
device          pty             # BSD-style compatibility pseudo ttys
device          md              # Memory "disks"
device          gif             # IPv6 and IPv4 tunneling
device          faith           # IPv6-to-IPv4 relaying (translation)
device          firmware        # firmware assist module

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device          bpf             # Berkeley packet filter

# USB support
#options        USB_DEBUG       # enable debug msgs
device          uhci            # UHCI PCI->USB interface
device          ohci            # OHCI PCI->USB interface
device          ehci            # EHCI PCI->USB interface (USB 2.0)
device          xhci            # XHCI PCI->USB interface (USB 3.0)
device          usb             # USB Bus (required)
device          uhid            # "Human Interface Devices"
device          ukbd            # Keyboard
device          umass           # Disks/Mass storage - Requires scbus and da
device          ums             # Mouse

# Sound support
device          sound           # Generic sound driver (required)
device          snd_hda         # Intel High Definition Audio
device          snd_ich         # Intel, NVidia and other ICH AC'97 Audio
device          snd_via8233     # VIA VT8233x Audio
```

Everything is working fine... so far.  (Anything else I can strip out of the above?)

loader.conf:


```
aio_load="YES"
accf_http_load="YES"
cc_cubic_load="YES"
if_re_load="YES"
vm.kmem_size_max="8192M"
opensolaris_load="YES"
zfs_load="YES"
vfs.root.mountfrom="zfs:gw"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"
vfs.zfs.arc_max="6144M"
ipmi_load="YES"
mfi_load="YES"
mpt_load="YES"
mps_load="YES"
kern.ipc.shmmaxpgs=65536
kern.ipc.semmni=40
kern.ipc.semmns=240
kern.ipc.semume=40
kern.ipc.semmnu=120
kern.ipc.shm_use_phys=1
kern.sched.preempt_thresh=224
nullfs_load="YES"
tmpfs_load="YES"
geom_mirror_load="YES"
```
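
When trimming a loader.conf like the one above during troubleshooting, it can help to list just the modules it asks the loader to preload. A minimal sketch, assuming the usual `name_load="YES"` line format; the function name `list_loader_modules` is made up for this example, and the demo runs on a scratch file rather than the real `/boot/loader.conf`.

```shell
#!/bin/sh
# List the modules a loader.conf asks the loader to preload.
# Usage: list_loader_modules /boot/loader.conf
list_loader_modules() {
    # Match lines of the form name_load="YES" (any letter case for YES),
    # skip commented-out lines, and print only the module name.
    sed -n 's/^[[:space:]]*\([A-Za-z0-9_]*\)_load="[Yy][Ee][Ss]".*$/\1/p' "$1"
}

# Hypothetical demo on a scratch copy so nothing under /boot is touched.
demo=$(mktemp)
cat > "$demo" <<'EOF'
zfs_load="YES"
#ipmi_load="YES"
vfs.root.mountfrom="zfs:gw"
mps_load="YES"
EOF
list_loader_modules "$demo"
rm -f "$demo"
```

Each name in the output can then be commented out in turn to see whether the boot problem follows it.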

So, everything seems to point to the /usr/src on my machine not liking the svn of the latest stuff over the existing 9.0-RELEASE /usr/src that was on the machine.  Once I blew away /usr/src, re-did svn, and rebuilt everything, it works fine.  *shrug*  Is this a bug?

Paul


----------



## wblock@ (Nov 28, 2012)

pathiaki said:

> So, everything seems to point to the /usr/src on my machine not liking the svn of the latest stuff over the existing 9.0-RELEASE /usr/src that was on the machine.  Once I blew away /usr/src, re-did svn, and rebuilt everything, it works fine.  *shrug*  Is this a bug?



Yes, but it's the hardest kind of bug to fix, a bug in expectations.  Checking out over an existing directory is not likely to give expected results.  There will be files not under svn's control mixed in with the repository versions.  At best, it will be hard to clean up.
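
One way to avoid that trap is to look at the directory before pointing svn at it. A minimal sketch, assuming a working copy can be recognised by a `.svn` directory at its top level; the function names are made up for this example, and the script only prints a suggestion, it never runs svn or deletes anything.

```shell
#!/bin/sh
# Classify a source directory before pointing svn at it.
# Prints one of: missing, working-copy, empty, unversioned
classify_src_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo missing
    elif [ -d "$dir/.svn" ]; then
        echo working-copy
    elif [ -z "$(ls -A "$dir")" ]; then
        echo empty
    else
        echo unversioned
    fi
}

# Suggest a next step; only prints, never runs svn or removes files.
suggest_svn_step() {
    dir=$1
    url=$2
    case $(classify_src_dir "$dir") in
        working-copy)  echo "svn update $dir" ;;
        missing|empty) echo "svn checkout $url $dir" ;;
        unversioned)   echo "move $dir aside, then: svn checkout $url $dir" ;;
    esac
}

suggest_svn_step /usr/src svn://svn.FreeBSD.org/base/stable/9
```

The "unversioned" case is the one described above: files left over from a release install or cvsup that svn does not know about.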

PS: please read http://forums.freebsd.org/showthread.php?t=8816 about using tags to make your messages easier to read.  For examples, see the edited post above.


----------



## Beeblebrox (Nov 28, 2012)

> After my svn, I set DESTDIR as /mnt.
> I did my make world and make kernel and it installed in the /mnt hierarchy.


That's exactly what I was trying to describe to you.
After you are done with everything and before shutting down, you export the pool, don't you? That leaves the zpool in an exported state, whereas it needs to be in an "active" state in order to respond to btx. Before shutting down, you need to re-import the pool without the altroot flag, then shut down. Try it and see - export a zpool with root on it, leave it like that and see if it is able to boot.
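
The sequence described here can be written out as a short plan. This is only a sketch of the steps as stated in this post, not a verified fix; the function name is made up for this example and the pool name `gw` is just an example value. Nothing is executed, the commands are printed for review.

```shell
#!/bin/sh
# Print the export / re-import / shutdown sequence for review.
# Nothing is executed; the pool name is a parameter.
print_reimport_plan() {
    pool=$1
    cat <<EOF
zpool export $pool
zpool import $pool
shutdown -h now
EOF
}

print_reimport_plan gw
```

Note that the plain `zpool import` (no altroot) mounts the pool's datasets at their configured mountpoints, on top of the running rescue system, which is why the shutdown follows immediately.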


----------

