# "Manual root filesystem specification" on install



## semafoor (Oct 14, 2013)

Hey everybody.

I'm having a problem with a new box. I've created a gmirror volume on some 3 TB disks. I manually partitioned it from a FreeBSD 9.2 memory stick using gpart from the fixit console, as I wanted to clone it from another machine using a dump.

My problem is, the box won't boot automatically. I end up having to manually specify the root filesystem.


```
Loader variables:

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:tank
        cd9660:/dev/acd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/acd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot> ?
List of GEOM managed disk devices:
  gptid/223fda50-34e1-11e3-8baa-10bf48e38fab gpt/root ufs/root ufsid/525c06db13a9bdb7 gptid/1d8390d2-34e1-11e3-8baa-10bf48e38fab gpt/swap gptid/15dc4e84-34e1-11e3-8baa-10bf48e38fab gpt/boot mirror/build0p3 mirror/build0p2 mirror/build0p1 mirror/build0 ada1 ada0

mountroot> ufs:mirror/build0p3
Trying to mount root from ufs:mirror/build0p3 []...
```

As you can see, it does see my gmirror device, and when I enter it manually, it boots normally. However, it seems that the loader does not parse my /etc/fstab correctly.

The same procedure worked countless times on 8.x, and I can't tell from the Handbook whether something has changed in the loader. Do I have to give the loader a hint somewhere?


```
# gmirror status
          Name    Status  Components
mirror/build0   COMPLETE  ada0 (ACTIVE)
                          ada1 (ACTIVE)

# gpart show
=>        34  5860533100   mirror/build0  GPT  (2.7T)
          34           6                  - free -  (3.0k)
          40         128               1  freebsd-boot  (64k)
         168     8388608               2  freebsd-swap  (4.0G)
     8388776  5852144352               3  freebsd-ufs  (2.7T)
  5860533128           6                  - free -  (3.0k)

# cat /etc/fstab

# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/mirror/build0p2    none            swap    sw              0       0
/dev/mirror/build0p3    /               ufs     rw              1       1
/dev/cd0                /cdrom          cd9660  ro,noauto       0       0
proc                    /proc           procfs  rw              0       0

# cat /boot/loader.conf
# loadable modules
accf_http_load="YES"
accf_data_load="YES"
geom_mirror_load="YES"
geom_eli_load="YES"
```


----------



## wblock@ (Oct 14, 2013)

There are multiple problems.  GPT puts a backup partition table at the end of the disk.  gmirror(8) also stores metadata there.  Neither wins.  Net result: use MBR for gmirror(8).  The Handbook gmirror section talks about this in the introduction.  See the end of the Metadata Issues section.

MBR itself is not made to deal with devices larger than 2 TB.
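The 2 TB figure follows from the MBR partition entry format, which stores partition start and length as 32-bit sector numbers. A minimal sketch of the arithmetic, assuming 512-byte logical sectors; the sector count is the usable size that `gpart show` reports for mirror/build0 earlier in the thread, and the function names are only illustrative:

```shell
#!/bin/sh
# MBR partition entries hold 32-bit sector numbers, so at most 2^32 sectors
# can be addressed.
SECTOR_BYTES=512              # assumption: 512-byte logical sectors
MBR_MAX_SECTORS=4294967296    # 2^32
MIRROR_SECTORS=5860533100     # usable size of mirror/build0 from gpart show

# Largest size in bytes that MBR can describe.
mbr_limit_bytes() {
    echo $((MBR_MAX_SECTORS * SECTOR_BYTES))
}

# $1 = size in sectors; succeeds if MBR can address all of it.
fits_in_mbr() {
    [ "$1" -le "$MBR_MAX_SECTORS" ]
}

echo "MBR limit: $(mbr_limit_bytes) bytes"
if fits_in_mbr "$MIRROR_SECTORS"; then
    echo "mirror fits in MBR"
else
    echo "mirror is too large for MBR"
fi
```

With these numbers the 3 TB mirror is past the limit, which is why MBR is not an option for it.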

The loader performs stricter validity tests on disk structures in later versions of FreeBSD.  The tests can be disabled: http://www.freebsd.org/releases/9.0R/relnotes-detailed.html#UPGRADE.


----------



## J65nko (Oct 15, 2013)

You could try to add this to /boot/loader.conf:

```
vfs.root.mountfrom='ufs:mirror/build0p3'
```


----------



## kpa (Oct 15, 2013)

I think the real problem is the order of the filesystems in fstab(5). Try putting the root filesystem on the first line of the file; right now it is on the second. I vaguely recall that the boot loader requires the root filesystem to be the first entry.


----------



## semafoor (Oct 15, 2013)

Thanks everyone 

Yeah, I'm aware that gmirror and GPT conflict with each other, from the point of view of an environment that is not gmirror-aware. But for this situation, I quickly need a super big volume, and I haven't played enough with ZFS yet to use that in production.

I followed all the leads that you guys gave. For now, it has helped to add this to /boot/loader.conf:

```
vfs.root.mountfrom="ufs:/dev/mirror/build0p3"
```

Now it boots straight through.
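To avoid typing the device path twice, the loader.conf line can be derived from the root entry in fstab. A small sketch, assuming the usual six-column fstab layout; `root_mountfrom` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Read an fstab on stdin and print the matching vfs.root.mountfrom line.
# Comment lines are skipped; the first entry mounted on / wins.
root_mountfrom() {
    awk '$1 !~ /^#/ && $2 == "/" {
        printf "vfs.root.mountfrom=\"%s:%s\"\n", $3, $1
        exit
    }'
}

# Example input taken from the fstab posted at the top of the thread.
root_mountfrom <<'EOF'
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/mirror/build0p2    none            swap    sw              0       0
/dev/mirror/build0p3    /               ufs     rw              1       1
EOF
```

On a live system the same function could be fed with `root_mountfrom < /etc/fstab`.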

I will look at it more, because I'm not sure why this works. Maybe the new loader does not find partitions directly in a raw volume? I didn't use slices in this case, which might be a "weird" scenario. I've never seen this happen, but then again I hardly use GPT.

For completeness: switching off GEOM checks, reordering /etc/fstab, or using GPT labels (e.g. /dev/gpt/root) did not change anything, by the way.

Thanks again.


----------



## jb_fvwm2 (Nov 9, 2014)

I'm curious as to whether this thread is the unofficial method of making a gmirror from GPT disks. And if so, if it could be written into a new device driver; something like gm0-GPT. And if so, if it could be configured to place multiple metadata files to read from somewhat like newfs does.


----------



## wblock@ (Nov 9, 2014)

The unofficial way to use GPT and gmirror(8) is by mirroring GPT partitions.  Preferably only one per disk.

However, none of these setups that mix the two are widely used or widely tested.  Because of that, they are not as safe as the standard MBR configuration.

This is not exactly a bug.  On any random disk, there are only two known locations: the beginning and the end.  GPT puts a primary partition table at the beginning of the disk, and a secondary, backup partition table at the end of the disk.  GPT is not meant to be used inside another partitioning scheme, either.  That secondary partition table really is meant to go at the end of the device.  GEOM providers like gmirror(8) put their metadata at the end of their data area, but that does not need to be a whole disk.  So it's possible to have a GPT-partitioned disk with gmirror(8) devices inside it.  The other way around, both want to use that last block of the disk.  If the secondary GPT table is overwritten by gmirror(8) metadata, no big deal.  If the gmirror(8) metadata is overwritten by a recovered or repaired GPT table, the mirror is damaged.
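The collision can be put in numbers. The following is only a sketch under stated assumptions: 512-byte sectors, a standard 128-entry GPT (32 sectors of entries plus one header sector at each end), gmirror(8) metadata in the last sector of its provider, and a hypothetical raw disk of 5860533168 sectors, a size picked because it is consistent with the `gpart show` output earlier in the thread. The function names are illustrative.

```shell
#!/bin/sh
# LBAs are counted on the raw disk; $1 is always the raw disk size in sectors.
GPT_BACKUP_SECTORS=33     # assumption: 32 entry sectors + 1 header sector

# gmirror on the whole disk keeps its metadata in the last sector.
gmirror_metadata_lba() {
    echo $(($1 - 1))
}

# GPT written directly to the disk keeps its backup header in the last sector.
gpt_backup_header_on_disk() {
    echo $(($1 - 1))
}

# GPT written on top of the mirror: the mirror provider is one sector
# shorter than the disk, so the backup header sits one sector lower.
gpt_backup_header_on_mirror() {
    echo $(($1 - 2))
}

# Last LBA available to partitions when GPT sits on top of the mirror.
last_usable_lba_on_mirror() {
    echo $(($1 - 1 - GPT_BACKUP_SECTORS - 1))
}

DISK_SECTORS=5860533168   # hypothetical 3 TB disk
echo "gmirror metadata:               LBA $(gmirror_metadata_lba "$DISK_SECTORS")"
echo "GPT backup header (on disk):    LBA $(gpt_backup_header_on_disk "$DISK_SECTORS")"
echo "GPT backup header (on mirror):  LBA $(gpt_backup_header_on_mirror "$DISK_SECTORS")"
echo "last usable LBA (on mirror):    LBA $(last_usable_lba_on_mirror "$DISK_SECTORS")"
```

The first two values are the same sector, which is the conflict described above. The last value matches the end of the free space shown for mirror/build0 (34 + 5860533100 - 1).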

Incidentally, the GPT boot code could be modified to look for a secondary GPT table inside another container rather than at the physical end of the device.  In fact, Hiroki Sato posted an experimental patch that did just that a couple of years ago.  But a disk formatted that way no longer complies with the GPT specifications, which state that the secondary table goes at the end of the device.  Some people might not be too concerned about this, but it turns out that FreeBSD leans more toward strict compatibility with standards than with the looser compliance of other operating systems.

Summary: gmirror(8) and GPT on full disks might compromise the increased data integrity that is the point of using a mirror.  If the disks are 2 TB or less, use gmirror(8) and create an MBR partitioning scheme on the mirror.

If the disks are larger than 2 TB, and enough memory is available, a ZFS mirror offers increased safety.

If you really insist, gmirror(8) mirrors can be created between GPT _partitions_ on different disks.  It would be best to limit this to a single mirror to avoid the massive head contention of rebuilding multiple mirrors on the same drives.
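For reference, mirroring two GPT partitions might look roughly like the sketch below. It only prints the commands (a dry run), and the names are placeholders: disks ada0 and ada1, partition index 3, and mirror name gm0. It assumes equally sized partitions already exist on both disks.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# $1, $2 = disks; $3 = partition index; $4 = mirror name (all placeholders).
mirror_gpt_partitions() {
    disk_a=$1
    disk_b=$2
    index=$3
    name=$4
    # Load the mirror class, then build one mirror from the two partitions.
    # The metadata goes in the last sector of each partition, so both GPT
    # tables on each disk stay untouched.
    echo "gmirror load"
    echo "gmirror label -v ${name} /dev/${disk_a}p${index} /dev/${disk_b}p${index}"
    # The mirror shows up as /dev/mirror/<name>; put the filesystem on that.
    echo "newfs -U /dev/mirror/${name}"
}

mirror_gpt_partitions ada0 ada1 3 gm0
```

Review the printed commands against gmirror(8) and newfs(8) before running any of them by hand.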

If you seriously super-really insist, GPT-partitioned drives can be mirrored with gmirror(8) by overwriting the secondary GPT table.  This adds some risk.  There is no backup of the primary GPT table.  More seriously, the mirror metadata goes right where the secondary GPT is expected.  The boot code normally complains about a corrupted backup table.  If there is more than one sysadmin, the others must be told to ignore this error, because "repairing" it will overwrite the mirror metadata on one or both drives.  Some time after that there will be a "we're doomed, dooooomed!" scenario, which I personally prefer to avoid.


----------



## jb_fvwm2 (Nov 9, 2014)

I had a GPT disk that could get to the Beastie menu but not to the filesystems: only the mountroot prompt, with only _ada0_ present.  `gpart show` no longer showed anything because both GPT tables were corrupted.  Searching the shell history, I found the commands used many months ago to create a similar disk, so I rewrote the boot MBR, GPT, and bootcode.  After that the disk was not bootable AND the Beastie menu was gone.  I then rewrote the boot MBR, GPT, and bootcode again and also ran the gpart commands to create the filesystem and swap partitions, and the p# partitions were created.  I tried mounting those.  They had labels, but were not mountable until a reboot, so I rebooted.  Per procedure, the next step would have been to newfs the partitions, but since the exact same create-GPT commands were used for sizing, they turned out to be already newfs'd and could be mounted, and all the data was recovered.  (This happened after my post above posing the question.)

So maybe gpart could be enhanced to search for the embedded newfs data of existing filesystems and create the GPT partitions backwards from that, somewhat like editing legacy bsdlabel tables into a sysinstall bottom-menu next-todo semi-installer.

And thanks for that information about mirroring GPT partitions etc.


----------



## wblock@ (Nov 9, 2014)

A GPT disk has a "protective" PMBR in the standard MBR area.  Do not modify that PMBR.  If you want to repartition, either modify the GPT partitions or back up, `gpart destroy` the GPT scheme, and `gpart create -s mbr` to create a real MBR partitioning scheme.

That destroy step is important.  Just creating a new MBR without destroying the GPT scheme leaves the backup GPT table in place, and confusing error messages result.
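The order of the two steps can be written down as a dry run. This sketch only prints the commands, since they are destructive; the disk name is a placeholder and a backup is assumed to exist.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# $1 = disk to repartition (placeholder name).
gpt_to_mbr_commands() {
    disk=$1
    # Destroy the GPT scheme first so no stale backup table is left behind.
    # -F forces the destroy even if the table still contains partitions.
    echo "gpart destroy -F ${disk}"
    # Only then create a real MBR partitioning scheme.
    echo "gpart create -s mbr ${disk}"
}

gpt_to_mbr_commands ada0
```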


----------



## jb_fvwm2 (Nov 9, 2014)

Well, I'm even more encouraged or discouraged.  While the unconventional skipped-steps method above saved the data at that point in time, its replacement has failed catastrophically, despite being made in the usual conventional manner. At first I could not mount root. I restored from its backup gpart file, gpart showed it with normal status, and I rebooted. Now the new disk is not only unbootable, it also halts any machine during boot, even when attached as a secondary disk.

I'm encouraged by the possibilities of gpart being enhanced, yet discouraged by this new type of failure I've not seen before.


----------



## rmoe (Nov 11, 2014)

Sorry, but this whole issue is irritating and creating lots of confusion and trouble. And frankly, in my mind's eye the responsibility is clearly in the FreeBSD guys' corner.

wblock@ is perfectly right; with hard drives only two points are clearly known, the start sector (S) and the last sector (L) - but so is L - x, x being a GPT table size. And we're not talking about significant sizes here.

So, how about:

- having a "RAID info" GPT partition type (to keep things orderly), and then
- autocreating a RAID info partition at the end of the disk, at a position that leaves enough (small anyway) space for the backup GPT table on GPT partitioned disks, when a GEOM RAID is created;
- having a new switch ("-2, --new-type") for the next gpart version (to stay perfectly compatible with existing installations).

Being consistent is one of the strengths of FreeBSD and I don't see a reason to break that unwritten law with GPT/software RAID.

Following the above approach all existing installations would continue to work without problems and for new installations we tell the users to use the new gpart switch to create the RAID information not in competition with GPT but right before GPT's backup table and properly in a new GPT partition type (so, for instance, admins could see right away what that chunk of data is and other eventual tools may properly work with it).


----------



## wblock@ (Nov 12, 2014)

It might be possible to create a new GEOM class, say gptmirror, that keeps metadata before a GPT backup table.  Offhand, I don't know whether that can be compliant with the standards, or whether that could be extensible for general use.  Ask on the freebsd-fs mailing list.


----------



## jb_fvwm2 (Nov 12, 2014)

Not five minutes ago I thought it would be nice if gpart were enhanced to be able to optionally put its secondary copy almost directly after its first one, leaving the end of the disk free for graid5 and gmirror, eventually fitting nicely into a sysinstall where one has a visual indication of the disk partitions and setup before the OS install. That was a huge convenience a decade ago (the latter concept, that is).


----------



## wblock@ (Nov 12, 2014)

This is not a gpart(8) thing.  It supports numerous types of formatting, including MBR and even goofier formats used on non-x86 platforms.  Putting the two copies of the GPT table near each other negates the additional safety that two copies provide.  The end of the disk might survive if the start fails, and the other way around.  But that is secondary, because locating the backup GPT table anywhere other than the end of the disk breaks the GPT specifications.  That's important.  Other GPT-compatible tools will not be able to find it if it is not in the location defined by the standard.


----------



## rmoe (Nov 12, 2014)

wblock@ said:


> This is not a gpart(8) thing.  It supports numerous types of formatting, including MBR and even goofier formats used on non-x86 platforms.  Putting the two copies of the GPT table near each other negates the additional safety that two copies provide.  The end of the disk might survive if the start fails, and the other way around.  But that is secondary, because locating the backup GPT table anywhere other than the end of the disk breaks the GPT spec.  That's important.  Other GPT-compatible tools will not be able to find it if it is not in the location defined by the standard.




Yes, you're perfectly right in both points. How we (FreeBSD) do software RAID is our business but GPT is a standard that must be respected. Whatever we come up with must be about our software RAID.


----------



## kpa (Nov 12, 2014)

rmoe said:


> Yes, perfectly right in both points. How we (FreeBSD) do software raid is our business but GPT is a standard that must be respected.
> Whatever we come up with must be about our software raid.



In my opinion the GPT standard has a problem but we can't do much about it. I've seen many systems where the geometry of the disk changes when moved, let's say from an external (USB/Firewire) enclosure to the internal controller, resulting in the disk being detected with a different size in LBA blocks. This of course shows as a "corrupted GPT backup header" message when the second copy of the GPT table is not where it's expected to be. This is of course also a problem with disk controller manufacturers and their quality control. Such changes in the disk geometry should not be possible when moving a disk from one controller to another.


----------



## jb_fvwm2 (Nov 12, 2014)

An enhancement to graid3... graid5... gmirror...  that detects a GPT header at the end of the disk and places its header before it instead of at the end?


----------



## rmoe (Nov 12, 2014)

kpa said:


> In my opinion the GPT standard has a problem but we can't do much about it. ... This is of course also a problem with disk controller manufacturers and their quality control, such changes in the disk geometry should not be possible when moving a disk from one controller to another.



Yep. And let me guess who was knee-deep in every standard even remotely disk-related. Similarly - and probably not coincidentally - other related areas (boot loaders; GRUB, anyone?) remind me more of a war zone (or a mental asylum?) than of seasoned professionals at work. Speaking of boot loaders, once more kudos to the BSD guys, who quite generally _are acting_ like seasoned professionals.

Nevertheless, we must live with that standard, preferably in a smart way and keeping the original concept in mind. Breaking it, no matter with what excuse or "good reasons", usually aggravates the problem rather than solving it.



jb_fvwm2 said:


> An enhancement to graid3... graid5... gmirror...  that detects a GPT header at the end of the disk and places its header before it instead of at the end?



I'd rather not. Disk-related stuff should be solved in a manner that is very low level and very close to the boot process. Considering that GPT offers plenty of partitions, and that GPT is, at least for a good part, pre-kernel, I'd prefer to go the partition way. One major advantage is that this is something that GPT understands; it's not something special, just yet another partition, albeit a very small and somewhat exotic one (quite possibly only known to the FreeBSD GPT implementation). And a partition is the kind of "data blob" that happens to be used, understood, and worked with at very early boot stages.


----------

