Wednesday, 25 January 2012

Joggler: How it boots

This is part of my series on running an unmodified Debian on the Joggler. See here for other posts on the same topic.

Okay, so the first thing that's important in running something other than the stock software is understanding exactly how the device boots normally, and at what points in that process you can easily/safely interfere.

The Joggler is mostly just a PC in a weird box: it has an Atom processor that runs normal x86 code. However, unlike most x86 computers, it doesn't have a traditional PC-compatible BIOS: it uses EFI. Some very modern PCs have EFI firmware too, but they generally include BIOS compatibility code that allows them to run traditional bootloaders and operating systems, and few of them actually boot through EFI just yet. The only platforms I know of where EFI booting is already used by large numbers of users are the x86-based Macs.

This means that the early stages of boot proceed very differently to the usual process on a PC, and to make things more annoying, the Joggler doesn't bother to display any hints as to what it's actually doing; the screen just shows a static logo at power-on (either the OpenPeak logo or the O2 logo, depending on exactly which hardware variant you have) and leaves it there until the OS has booted far enough to take control of the display and show its own boot animation. As far as I know there's nothing you can press/do to see any actual boot progress; the logo is all you get.

Fortunately, other people already figured out pretty much everything it does during boot, though the details don't seem to be explained well in any one place. To try and help out with that, here's my explanation of how it boots its factory-shipped OS. For each step I've included a note comparing it to PC BIOS booting, followed by a note on how you can encourage it to deviate from the normal boot sequence.
  1. Power on. The EFI firmware lives in a small flash chip, and the CPU starts executing from there. It loads a bunch of EFI drivers for various hardware: the display driver (which has the boot logo embedded in it), USB host support, MMC/SD controller support (for the internal drive, which is an eMMC device), and various other hardware that's not interesting. It also loads a FAT filesystem driver, which EFI requires. I don't think the Joggler's EFI implementation supports any other filesystems.
     PC BIOS boots much the same way, but the drivers are not quite so well-defined, and it has as few of them as it can get away with. It also doesn't generally have any filesystem support.
     Interfering at this stage is possible: reflash the firmware chip with something else. But... let's not. There's probably no way to recover from a bad flash without hotswapping the chip into another working Joggler, and the EFI implementation is not really a problem.
  2. The EFI boot manager runs. It has a list of boot choices stored in NVRAM; each entry specifies a device to look at and some kind of path on that device to find. For disk-like storage devices (the internal eMMC, a USB thumbdrive, whatever) this is pretty simple. Weirder kinds of entry exist, like network booting, but I have no idea if the Joggler supports them and I don't need them for anything. The entries have to refer to something that EFI knows how to run: usually an EFI binary, with extension .efi, compiled specifically for the firmware environment. The boot manager tries the entries in order until it finds one it can run. The Joggler's default boot choice list only has one valid entry: the internal EFI shell built into the firmware.
     PC BIOS has a priority order for what device to boot from, but booting from disks just loads the first sector of the disk into memory and jumps to it; no files are involved. There's also not really any sensible fallback option, since BIOS doesn't have a shell.
     Once booted by some other method, you can edit the boot choice list (bcfg in an EFI shell, efibootmgr in Linux). You can add new entries above the default shell choice, and they will be tried first. Each entry needs to point at an EFI binary. I'm not sure what the Joggler's implementation of EFI will do if none of the entries are valid; in theory it should fall back on a sensible default (running various fixed paths from any available disk) but I don't particularly want to test this in case I end up with a brick.
  3. The built-in EFI shell runs. It doesn't actually display anything on the screen on the Joggler, but if it did, it would show a countdown prompt for several seconds. If you press ESC on a USB keyboard during this countdown, the boot process stops and you get interactive access to the shell, though you still can't see anything, so you can only type commands blind. If you don't abort the countdown, it looks for a file called startup.nsh on the available disks and executes each line as a shell command. Running the script redirects input, so that all EFI input from that point onwards comes from startup.nsh, not the keyboard.
     PC BIOS doesn't have any equivalent to this; control proceeds from the BIOS directly to the MBR code on the chosen device.
     You can't take over boot automatically at this point because startup.nsh will be found on the internal disk first; the shell will only load startup.nsh from a USB disk if there is no longer one on the internal disk. You can hit ESC and boot anything you like manually, though, as long as you can type blind.
  4. If you haven't interfered, the shell will find startup.nsh on the internal eMMC's first partition, which is a FAT partition. This runs boot.efi in the root of the same partition, which is the Joggler's retail bootloader. That's just an EFI binary that loads their Linux (the kernel is stored on the FAT partition, the root filesystem is another partition on the eMMC, and both exist twice, probably to facilitate safe upgrades), and it's not really interesting.
     On PC BIOS, this is roughly equivalent to the MBR chainloading the partition boot sector and then the partition boot sector chainloading a real bootloader.
     You can take over at this point by putting a boot.nsh on a USB disk, because the startup.nsh checks on fs1: (the second drive detected) for "boot" before it checks fs0:. This is the method most Joggler hacking docs suggest; however, if the thing you load expects to get keyboard input from EFI, it won't be able to: input is still redirected to the script. This means you can't, for example, actually use the menu in GRUB, or type anything into an EFI shell you manually invoke this way. So, this has limited use.
That's basically it. Most people appear to be interfering with boot by providing boot.nsh on a USB device and booting elilo or grub2-efi from there. This works, but is kinda sucky because you can't interact with the bootloader (to boot a different kernel, or to manually type command line arguments) unless you hit ESC during boot and blind-type the EFI shell commands to invoke your Linux bootloader.
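
To make the sequence a bit more concrete, here is a toy Python model of steps 2 to 4. It is only a sketch of the behaviour described above, not anything that runs on the device. The function names, the "usb bootloader" entry and the rule that a bare "boot" prefers boot.nsh over boot.efi are all made up for illustration, and it assumes the internal eMMC shows up as fs0: and a USB disk as fs1:.

```python
INTERNAL_SHELL = "internal EFI shell"


def boot_manager(entries, runnable):
    """Step 2: walk the NVRAM boot list in order, return the first usable entry."""
    for entry in entries:
        if entry in runnable:
            return entry
    return None  # what the real firmware does here is unknown


def resolve_boot(files):
    """Pick what a bare `boot` command would run on one filesystem.

    Assumption: boot.nsh wins over boot.efi if both exist. The cases
    below never have both, so the choice does not affect them.
    """
    for name in ("boot.nsh", "boot.efi"):
        if name in files:
            return name
    return None


def factory_startup_nsh(fs):
    """Step 4: the factory script looks on fs1: before fs0:.

    Anything it starts has its input redirected, so no keyboard.
    """
    for index in (1, 0):
        if index < len(fs):
            name = resolve_boot(fs[index])
            if name:
                return (f"fs{index}:\\{name}", False)
    return (None, False)


def boot(entries, runnable, fs, esc_pressed=False):
    """Return (what ends up running, whether it can read the keyboard)."""
    chosen = boot_manager(entries, runnable)
    if chosen is None:
        return (None, False)
    if chosen != INTERNAL_SHELL:
        return (chosen, True)  # started straight from the boot manager
    if esc_pressed:
        return ("interactive shell", True)  # step 3: countdown aborted
    return factory_startup_nsh(fs)


internal = {"startup.nsh", "boot.efi"}  # factory FAT partition
usb = {"boot.nsh"}                      # the usual hacking method

# Stock boot, then the boot.nsh method, then a boot manager entry.
print(boot([INTERNAL_SHELL], {INTERNAL_SHELL}, [internal]))
print(boot([INTERNAL_SHELL], {INTERNAL_SHELL}, [internal, usb]))
print(boot(["usb bootloader", INTERNAL_SHELL],
           {"usb bootloader", INTERNAL_SHELL}, [internal, usb]))
```

The second element of each result is the interesting part: only the paths that avoid startup.nsh (hitting ESC, or an entry in the boot manager list) leave the keyboard usable.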

The optimal solution is to install a "real" bootloader somewhere (one that lets you choose what to boot), either on the internal memory or on a USB device, and then insert a boot manager entry at the top of the list which boots that before it tries the internal shell. This means you don't get your input redirected: a connected USB keyboard will always work in the bootloader. You can still boot the original Joggler software from here if you want, too, since EFI programs can load other EFI programs: they just need to chainload the original boot.efi.
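
As a rough illustration of the "insert a boot manager entry" part, the sketch below builds (but deliberately does not run) an efibootmgr command line from Python. The device name, label and loader path are placeholders rather than values for a real Joggler setup. As far as I know efibootmgr puts a newly created entry first in the boot order, but that is worth checking with `efibootmgr -v` afterwards rather than taking on trust.

```python
import shlex


def efibootmgr_create_args(disk, partition, label, loader):
    """Build the argument list for creating one EFI boot manager entry.

    Nothing is executed here; the caller decides whether to run it.
    """
    return [
        "efibootmgr",
        "--create",
        "--disk", disk,             # disk holding the FAT partition
        "--part", str(partition),   # partition number on that disk
        "--label", label,           # name shown in the boot list
        "--loader", loader,         # path to the EFI binary on that partition
    ]


args = efibootmgr_create_args(
    disk="/dev/sdX",                      # placeholder device name
    partition=1,
    label="Example bootloader",           # placeholder label
    loader="\\EFI\\example\\loader.efi",  # placeholder path; EFI paths use backslashes
)

# Print a copy-pasteable command line for review before running anything.
print(shlex.join(args))
```

Running the resulting command needs root on a Linux system that was itself booted through EFI, for example via `subprocess.run(args, check=True)`. Given the brick risk mentioned in step 2, printing the command and reading it first seems the safer habit.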

"Which real bootloader to use, though? And where to get a binary?" you ask. Well, I'll cover that in a future post.