Booting a modern PC is a rather complex affair with many moving parts. While the individual parts are described in great detail, the overall process is less well documented. In this blog post I will attempt to provide a high-level overview, describing the components involved and their interfaces, and providing links to more detailed documentation about the inner workings.
Since this is a huge subject I’ll limit myself to describing the happy flow of a modern Intel x86-64 PC booting into Debian. Booting an AMD PC differs in some details, but the overall flow is the same. Although I focus on Debian, the process is the same for Debian derivatives (Ubuntu, Mint, etc.) and mostly applies to other distributions (Red Hat, Gentoo) as well.
Modern CISC processors such as Intel x86-64 don’t implement all their functionality in hardware; some of it is implemented in special software called microcode. Older CPUs relied solely on the microcode they shipped with, whereas modern CPUs load an updated copy after every reset.
Before starting up (coming out of reset) the processor will load the Firmware Interface Table by looking at a fixed place in memory mapped to the ROM containing the firmware. It will find the correct microcode for its stepping and load it.
This mechanism allows the microcode to be updated when bugs are discovered. It also allows motherboards to support CPUs which didn’t exist yet when the board was produced, by flashing an update [1].
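To make the lookup a bit more concrete, here is a minimal Python sketch that walks a FIT inside a dump of the firmware flash. It assumes the layout from Intel's FIT specification as I understand it: a pointer stored at physical address 4 GiB - 0x40, followed by 16-byte entries in which type 0x00 is the table header and type 0x01 points at a microcode update. The function name, and the assumption that the last byte of the dump maps to the top of the 4 GiB address space, are mine and purely illustrative; the CPU does all of this in hardware.

```python
import struct

FIT_POINTER_FROM_TOP = 0x40              # assumed: pointer lives at 4 GiB - 0x40
FIT_ENTRY = struct.Struct("<8s3sBHBB")   # address, size, reserved, version, type|C_V, checksum
FIT_TYPE_HEADER = 0x00
FIT_TYPE_MICROCODE = 0x01


def microcode_addresses(image):
    """Return the physical addresses of all microcode entries listed in the FIT."""
    base = (1 << 32) - len(image)        # physical address of image[0]
    (fit_addr,) = struct.unpack_from("<Q", image, len(image) - FIT_POINTER_FROM_TOP)
    off = fit_addr - base
    if not 0 <= off <= len(image) - FIT_ENTRY.size:
        raise ValueError("FIT pointer does not point into the image")

    # The first entry is the header: its address field holds a signature
    # and its size field holds the total number of entries.
    sig, size, _res, _ver, type_cv, _sum = FIT_ENTRY.unpack_from(image, off)
    if sig != b"_FIT_   " or (type_cv & 0x7F) != FIT_TYPE_HEADER:
        raise ValueError("no FIT header found")
    count = int.from_bytes(size, "little")

    found = []
    for i in range(1, count):
        addr, _size, _res, _ver, type_cv, _sum = FIT_ENTRY.unpack_from(
            image, off + i * FIT_ENTRY.size)
        if (type_cv & 0x7F) == FIT_TYPE_MICROCODE:
            found.append(int.from_bytes(addr, "little"))
    return found
```

Each returned address points at a microcode update; the processor picks the one that matches its own signature.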
The CPU then proceeds to run the firmware embedded in a ROM on the motherboard. On PCs the firmware is usually referred to as the BIOS, although strictly speaking this refers to a specific type of firmware which is no longer used [2]. Modern PCs use firmware which complies with UEFI.
The firmware first initializes all hardware, usually starting with DRAM. Because the firmware is tied to the specific motherboard, it is able to initialize all on-board components. For other components, UEFI specifies how to load device drivers. The UEFI specification is over 2500 pages long and defines interfaces for hardware such as storage media, USB, and even Bluetooth.
UEFI itself actually specifies a rather complete operating system, including the ability to execute programs, read data from partitions, and communicate using HTTP; it even includes a virtual machine. The main thing that sets it apart from (modern) operating systems is that it doesn’t support multi-threading.
The firmware scans for storage formatted using a GUID Partition Table (GPT). The GPT contains a unique GUID identifying that particular storage device, as well as a list of partitions. Each partition is identified by a unique GUID and has a certain type (also a GUID), optionally with a human-readable name. The firmware looks for partitions of type EFI System Partition (ESP). The ESP is formatted using FAT and contains EFI executables.
Usually the ESP is mounted on
# ls /boot/efi/EFI
BOOT  debian
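The GPT scan described above can be sketched in a few lines of Python. This is only an illustration working on a raw disk image held in memory: it assumes 512-byte sectors, skips the CRC checks and the backup table that real firmware would consult, and the names `find_esps`, `HEADER` and `ENTRY` are my own. The field layout and the ESP type GUID come from the UEFI specification.

```python
import struct
import uuid

SECTOR = 512  # assumption: 512-byte logical sectors
ESP_TYPE = uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b")

# GPT header (92 bytes) and partition entry (128 bytes).
HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")
ENTRY = struct.Struct("<16s16sQQQ72s")


def find_esps(disk, sector=SECTOR):
    """Return (disk GUID, list of ESPs) found in a raw disk image."""
    fields = HEADER.unpack_from(disk, 1 * sector)   # primary header sits at LBA 1
    if fields[0] != b"EFI PART":
        raise ValueError("no GPT signature at LBA 1")
    disk_guid = uuid.UUID(bytes_le=fields[9])       # GUIDs are stored mixed-endian
    entries_lba, count, entry_size = fields[10], fields[11], fields[12]

    esps = []
    for i in range(count):
        off = entries_lba * sector + i * entry_size
        type_raw, unique_raw, first, last, _attrs, name_raw = ENTRY.unpack_from(disk, off)
        if uuid.UUID(bytes_le=type_raw) != ESP_TYPE:
            continue                                # unused or non-ESP partition
        name = name_raw.decode("utf-16-le").split("\x00", 1)[0]
        esps.append((uuid.UUID(bytes_le=unique_raw), first, last, name))
    return disk_guid, esps
```

The partition GUID and first LBA returned here are the same kind of values that show up in a boot entry's `HD(...)` device path.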
The firmware then consults NVRAM to determine which executable on which ESP to start.
The NVRAM can be queried using efibootmgr:
# efibootmgr -v
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000
Boot0000* debian HD(1,GPT,0e11b308-ba4f-4e8d-a694-e417e541b945,0x800,0xee000)/File(\EFI\debian\shimx64.efi)
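For the curious, the same information can be decoded by hand. The sketch below is a simplified Python illustration, not how efibootmgr is implemented. It assumes the raw variable contents as exposed by Linux's efivarfs, where each file starts with a 32-bit attribute word, and it only understands the file path node of the device path. The function names are mine.

```python
import struct

EFIVARFS_ATTR_LEN = 4  # efivarfs prefixes the variable data with its attributes


def parse_boot_order(raw):
    """BootOrder is a plain array of little-endian 16-bit entry numbers."""
    data = raw[EFIVARFS_ATTR_LEN:]
    return ["Boot%04X" % n for (n,) in struct.iter_unpack("<H", data)]


def parse_load_option(raw):
    """Return (description, file path) of a Boot#### variable."""
    data = raw[EFIVARFS_ATTR_LEN:]
    _attributes, path_len = struct.unpack_from("<IH", data, 0)

    # The description is a NUL-terminated UTF-16 string.
    start = end = 6
    while end + 2 <= len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    description = data[start:end].decode("utf-16-le")

    # The device path is a list of nodes: type, subtype, total length.
    pos = end + 2
    stop = pos + path_len
    file_path = None
    while pos + 4 <= stop:
        node_type, subtype, length = struct.unpack_from("<BBH", data, pos)
        if length < 4:
            raise ValueError("corrupt device path node")
        if (node_type, subtype) == (0x04, 0x04):    # media device path, file path
            file_path = data[pos + 4:pos + length].decode("utf-16-le").rstrip("\x00")
        if node_type == 0x7F:                       # end of device path
            break
        pos += length
    return description, file_path
```

On a running system the input would come from files such as `BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c` under `/sys/firmware/efi/efivars`, read as binary.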
The firmware will then start a program called shim, which in turn loads the actual bootloader. The purpose of shim is to allow booting Linux on systems which enforce SecureBoot and only allow programs signed by Microsoft to be executed. The shim is signed by Microsoft so that these PCs can still boot other operating systems, such as Linux, provided the Machine Owner consents.
Going into the details of SecureBoot is out of scope for this post, as this topic is large enough to deserve a (series of) post(s) by itself.
shim will start the actual bootloader found in the same directory, which is GRUB.
The GRand Unified Bootloader (GRUB) will read its configuration file grub.cfg in the same directory [3]. This config file will instruct GRUB to find the boot partition and load /boot/grub/grub.cfg. This file in turn instructs GRUB to load all kinds of extensions from /boot/grub/x86_64-efi as well as provide a menu for the user to choose from. If the user doesn’t provide input, GRUB will select the first entry and attempt to boot.
Loading Linux is not trivial and requires following the Linux/x86 Boot Protocol. This includes copying different parts of the kernel image to different places in memory based on the real-mode kernel header, and updating that same header with information about the bootloader and its version, whether the system is bare metal or a virtual machine, the location of the command line, and a number of memory locations. Even more configuration has to be provided in a piece of memory called the Zero Page.
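To give an idea of what "updating the header" means, here is a small Python sketch that reads a few fields of the real-mode kernel header from a kernel image and fills in some of the fields a bootloader is expected to set. The offsets are taken from the Linux/x86 Boot Protocol documentation; the function names and the selection of fields are mine, and a real bootloader such as GRUB does considerably more than this.

```python
import struct

# Offsets into the kernel image, per the Linux/x86 Boot Protocol.
OFF_SETUP_SECTS = 0x1F1
OFF_BOOT_FLAG = 0x1FE
OFF_MAGIC = 0x202
OFF_VERSION = 0x206
OFF_TYPE_OF_LOADER = 0x210
OFF_RAMDISK_IMAGE = 0x218   # followed by ramdisk_size at 0x21C
OFF_CMD_LINE_PTR = 0x228


def read_setup_header(image):
    """Validate the header and return a few fields as a dict."""
    (boot_flag,) = struct.unpack_from("<H", image, OFF_BOOT_FLAG)
    if boot_flag != 0xAA55 or bytes(image[OFF_MAGIC:OFF_MAGIC + 4]) != b"HdrS":
        raise ValueError("not a Linux kernel image with a setup header")
    (version,) = struct.unpack_from("<H", image, OFF_VERSION)
    ramdisk_image, ramdisk_size = struct.unpack_from("<II", image, OFF_RAMDISK_IMAGE)
    (cmd_line_ptr,) = struct.unpack_from("<I", image, OFF_CMD_LINE_PTR)
    return {
        "protocol": (version >> 8, version & 0xFF),
        "setup_sects": image[OFF_SETUP_SECTS] or 4,   # 0 means 4, for compatibility
        "type_of_loader": image[OFF_TYPE_OF_LOADER],
        "ramdisk_image": ramdisk_image,
        "ramdisk_size": ramdisk_size,
        "cmd_line_ptr": cmd_line_ptr,
    }


def fill_loader_fields(image, loader_type, ramdisk_addr, ramdisk_size, cmdline_addr):
    """Write the fields a bootloader reports back to the kernel (in place)."""
    image[OFF_TYPE_OF_LOADER] = loader_type
    struct.pack_into("<II", image, OFF_RAMDISK_IMAGE, ramdisk_addr, ramdisk_size)
    struct.pack_into("<I", image, OFF_CMD_LINE_PTR, cmdline_addr)
```

`fill_loader_fields` expects a mutable `bytearray`; the addresses passed in are wherever the bootloader chose to place the initramfs and the command line in memory.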
Next GRUB will load the initramfs into memory and record its position in the kernel header. The initramfs is an archive in CPIO format containing the files which populate the initial file system. It can actually consist of multiple concatenated archives, each optionally compressed.
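The CPIO variant used here is the "newc" format, which is simple enough to walk by hand. The following Python sketch lists the members of the first uncompressed archive in a buffer. It is an illustration only: `list_newc` is a name I made up, compressed archives are not handled, and any zero padding between concatenated archives would have to be skipped by the caller.

```python
HEADER_LEN = 110  # 6-byte magic followed by 13 fields of 8 hex digits each
FIELDS = ("ino", "mode", "uid", "gid", "nlink", "mtime", "filesize",
          "devmajor", "devminor", "rdevmajor", "rdevminor", "namesize", "check")


def _align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def list_newc(archive, start=0):
    """Return ([(name, size), ...], offset just past the archive trailer)."""
    members = []
    pos = start
    while True:
        header = archive[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:6] not in (b"070701", b"070702"):
            raise ValueError("no newc header at offset %d" % pos)
        values = {name: int(header[6 + 8 * i:14 + 8 * i], 16)
                  for i, name in enumerate(FIELDS)}

        name_start = pos + HEADER_LEN
        # namesize includes the terminating NUL byte
        name = archive[name_start:name_start + values["namesize"] - 1].decode()
        # file data starts on a 4-byte boundary and is padded to one as well
        data_start = start + _align4(name_start + values["namesize"] - start)
        pos = start + _align4(data_start + values["filesize"] - start)

        if name == "TRAILER!!!":
            return members, pos
        members.append((name, values["filesize"]))
```

The returned offset is where a following archive, or the padding before it, begins.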
Finally GRUB hands over control to the Linux kernel.
One of the first things Linux attempts to do is update the microcode of the CPU using the Linux Microcode Loader. This may seem odd, because loading the microcode is the first thing that happens, even before the firmware runs. However, imagine a scenario in which the firmware contains an old, buggy copy of the microcode, one that could potentially cause a crash during boot. By attempting to update the microcode early, the system might still be able to boot successfully.
Linux attempts to find the microcode in the first archive making up the initramfs. This archive must be uncompressed and the microcode must be located at /kernel/x86/microcode/GenuineIntel.bin. If the provided microcode matches the current CPU and its revision is higher, it will be loaded.
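The "matches the current CPU and is newer" decision can be sketched as follows. This Python snippet parses the 48-byte header at the start of an Intel microcode update and compares it against values that, on a real system, come from CPUID and model-specific registers. It is a simplified illustration: the function name and parameters are mine, and it ignores both the checksum and the optional extended signature table, which the kernel does take into account.

```python
import struct

# header version, update revision, date, processor signature, checksum,
# loader revision, processor flags, data size, total size, 12 reserved bytes
MC_HEADER = struct.Struct("<9I12x")


def should_load(blob, cpu_signature, cpu_platform_flags, current_revision):
    """Return True if the update at the start of blob fits this CPU and is newer."""
    (hdr_version, revision, _date, signature, _checksum,
     _loader_rev, flags, _data_size, _total_size) = MC_HEADER.unpack_from(blob, 0)

    if hdr_version != 1:
        return False                      # unknown header format
    if signature != cpu_signature:
        return False                      # built for a different CPU model/stepping
    if (flags or cpu_platform_flags) and not (flags & cpu_platform_flags):
        return False                      # platform ID does not match
    return revision > current_revision    # only ever move forward
```

A file such as GenuineIntel.bin usually contains several updates back to back, so a loader would repeat this check for each of them.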
Next Linux will initialize all kinds of hardware, reinitializing hardware already initialized by the UEFI firmware, and clean up the code left in memory by the firmware, using hooks provided to the kernel by the firmware specifically for this purpose.
After initialization, the kernel will uncompress the initramfs archive and use it to populate the initial file system. Finally the kernel will run
The init script in the initramfs is a normal shell script (the interpreter is included in the initramfs in the form of busybox). The script will set up RAID, mount encrypted volumes (prompting the user for a password if necessary), and initialize the Logical Volume Manager (LVM).
Finally the script will call the actual init system by executing /sbin/init on the actual root partition.
The default init system on Debian is systemd. It is responsible for starting consoles, setting up udev, loading kernel modules, initializing additional hardware, starting daemons, and (if applicable) starting the X Window System.
Describing just the basics of how systemd works requires a blog post in itself.
In this post I have attempted to give a high-level overview of the boot process. In doing so I have skipped over many, many details. Hopefully, however, I managed to capture the essence of the process and provide some new insight to the reader.
If I missed a major step in the process or my post contains factual errors, please do get in touch using one of the methods listed at the bottom of this page.
Now that you have reached the end of this post, you may feel that the whole shim -> GRUB -> Linux chain is a bit complex and wonder whether it can be done in an easier way. The answer is: yes, it can, but that is a topic for a future post.
[1] Updating a motherboard to support a newer CPU presents an interesting challenge: the motherboard cannot boot the new CPU without an update, but the update can only be installed when booted up. This is usually resolved by booting with an older CPU, flashing the update, and then installing the new CPU.
[2] Many manuals, including those by Intel, still refer to the firmware as the BIOS.
[3] When reading about GRUB, beware that GRUB also supports ‘classic’ or MBR-style systems. When you read about things like stage 1 and stage 1.5, MBR, and the BIOS boot partition, these refer to the MBR-style boot and are not relevant for (U)EFI.