"To Know Me Better, Try Walk in My Boots", Said GNU/Linux

A quick walk through the GNU/Linux boot process

This post talks (quickly) about how a typical Linux (PC) system boots. I’ll try to avoid too many technical terms. It’ll be a quick walk without diving too deep.

The boot process of a system involves several steps.
When the computer first powers up, the code in its ROM is executed. This code, generally called the firmware, has some knowledge of the system hardware. On PCs this initial boot code is generally called the BIOS (Basic Input/Output System). A PC can have several levels of BIOS: for the machine itself, graphics cards, network cards, etc. In the context of booting, the BIOS generally performs two operations:
  • It figures out which device to boot from
  • It loads a small program from the boot sector of that device, which in turn loads the secondary boot loader

The MBR loading process then takes place.

The MBR (Master Boot Record) is the boot sector on hard disks. It is the first sector on the disk (512 bytes) and holds
  • Primary boot loader code (446 bytes)
    The first 446 bytes of the MBR contain the code that locates the partition to boot from. The rest of the boot process takes place from that partition. This partition contains a software program for booting the system called the ‘bootloader’.
  • Partition table information (64 bytes)
    The MBR contains 64 bytes of data which store partition table information such as
    - where each partition starts and ends
    - the size of the partition
    - the type of the partition (whether it’s primary, extended, etc.)
    Each partition entry needs 16 bytes, so 64 / 16 gives at most 4 primary partitions.
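  • Magic number (2 bytes)
    The magic number serves as a validation check for the MBR. These last two bytes hold the signature 0x55 0xAA; if they are missing, the firmware treats the sector as not bootable. The signature only flags a damaged or absent MBR, it cannot be used to recover one.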
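The 446 + 64 + 2 byte layout described above can be sketched in a few lines of Python. This is only an illustration under the classic MBR layout; the function name `parse_mbr` and the dictionary keys are made up for this post, and the demo sector is fabricated in memory rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446      # primary boot loader code
ENTRY_SIZE = 16           # one partition table entry
ENTRY_COUNT = 4           # 64 bytes / 16 bytes per entry
SIGNATURE = b"\x55\xaa"   # the 2-byte magic number


def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR into boot code, partition entries and signature."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    table_end = BOOT_CODE_SIZE + ENTRY_SIZE * ENTRY_COUNT  # 446 + 64 = 510
    partitions = []
    for i in range(ENTRY_COUNT):
        start = BOOT_CODE_SIZE + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Entry layout: status (1), CHS start (3, skipped), type (1),
        # CHS end (3, skipped), first LBA sector (4), sector count (4).
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": sectors,
            })
    return {
        "boot_code": sector[:BOOT_CODE_SIZE],
        "partitions": partitions,
        "valid": sector[table_end:] == SIGNATURE,
    }


if __name__ == "__main__":
    # Fabricated sector: one bootable Linux (type 0x83) partition.
    entry = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 1_000_000)
    demo = bytes(BOOT_CODE_SIZE) + entry + bytes(48) + SIGNATURE
    info = parse_mbr(demo)
    print(info["valid"], info["partitions"])
```

To try it on a real disk you would pass the first 512 bytes of the device instead of the fabricated sector, which normally requires root privileges.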

GRUB loading

The GRand Unified Bootloader (GRUB) is the boot loader for most Linux distributions. Its job is to load a kernel, with the specified options, from a pre-prepared list of kernels. GRUB has two versions
- GRUB legacy
- GRUB 2

GRUB works in stages.
Stage 1 is located in the MBR and mainly points to the next stage, since the MBR is too small to contain all of the needed data.
Stage 2 reads its configuration file and provides the user interface and options we are normally familiar with when talking about GRUB. Stage 2 can be located anywhere on the disk. If Stage 2 cannot find its configuration file, GRUB stops the boot sequence and presents the user with a command line for manual configuration.
Stage 1.5 also exists and may be used if it is small enough to fit in the area immediately after the MBR. In GRUB Legacy it sits between Stage 1 and Stage 2 and understands the filesystem, so Stage 2 can be found by path.

GRUB 2 has replaced the first version of GRUB, which is now called GRUB Legacy. GRUB 2 offers better portability and modularity, support for non-ASCII characters, dynamic loading of modules, real memory management, and more.

Kernel initialization

The kernel itself is a program which on Linux is usually located as some variant of ‘/boot/vmlinuz’. On my system it is
After the kernel is loaded, it probes the system to find out how much RAM is available and reserves some memory for its own statically sized data structures. Then the kernel probes for the hardware that is present and loads device drivers. Mostly, the kernel loads device drivers as independent kernel modules, so a small Linux kernel can support a large number of hardware devices. Different GNU/Linux distributions can come with different driver modules bundled with the kernel.
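To see which driver modules the kernel has actually loaded, we can read ‘/proc/modules’ (the same data ‘lsmod’ shows). Below is a small Python sketch; the helper name `parse_modules` and its dictionary keys are my own invention, and it only assumes the usual "name size refcount used-by ..." column order of that file.

```python
import os


def parse_modules(text: str) -> list:
    """Parse text in /proc/modules format into one dict per module."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, size, refcount, used_by = fields[:4]
        modules.append({
            "name": name,
            "size": int(size),
            # the count can be "-" on kernels built without module unloading
            "used_by_count": int(refcount) if refcount.isdigit() else None,
            # "-" means nothing uses it; otherwise a comma-terminated list
            "used_by": [m for m in used_by.split(",") if m and m != "-"],
        })
    return modules


if __name__ == "__main__":
    if os.path.exists("/proc/modules"):
        with open("/proc/modules") as fh:
            for mod in parse_modules(fh.read()):
                print(mod["name"], mod["size"], mod["used_by"])
```

On a system with a monolithic kernel (everything compiled in), the file can be empty or missing, in which case the script prints nothing.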
After basic initialization the kernel creates several “spontaneous” processes. They are called spontaneous because they are not started from user space with the regular ‘fork’ mechanism. We can see these processes with
ps -A
Most system-started processes have low PIDs. The process with PID 1 is the init system; PID 0 belongs to the kernel’s idle task and is not listed by ‘ps’.
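If you want to pick the kernel's own threads out of that list, one rough approach is to look at each process's parent. The Python sketch below reads ‘/proc/<pid>/stat’ and relies on an assumption that holds on modern Linux kernels: kernel threads are ‘kthreadd’ (PID 2) and its children. The function names are hypothetical helpers written for this post.

```python
import os


def parse_stat(line: str) -> tuple:
    """Return (pid, name, ppid) from the contents of /proc/<pid>/stat."""
    # The name sits in parentheses and may itself contain spaces or ")",
    # so split on the first "(" and the last ")".
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    name = line[open_paren + 1:close_paren]
    rest = line[close_paren + 1:].split()  # rest[0] is state, rest[1] is ppid
    return pid, name, int(rest[1])


def is_kernel_thread(pid: int, ppid: int) -> bool:
    """Assumption: kernel threads are kthreadd (PID 2) and its children."""
    return pid == 2 or ppid == 2


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        pids = sorted((e for e in os.listdir("/proc") if e.isdigit()), key=int)
        for entry in pids:
            try:
                with open(f"/proc/{entry}/stat") as fh:
                    pid, name, ppid = parse_stat(fh.read())
            except OSError:
                continue  # the process exited while we were scanning
            if is_kernel_thread(pid, ppid):
                print(pid, name)
```

The parent-PID check is a heuristic, not an official API; another common hint is that kernel threads have an empty ‘/proc/<pid>/cmdline’.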

Init process

init is the first user-space process to run and always has PID 1. Different systems may have different implementations of init, e.g. the last time I used Ubuntu, it was using ‘upstart’. We can check the init system on our machine with
=> ps -A | head -n 2
=> PID TTY          TIME CMD
    1  ?        00:00:00 systemd
The init program is generally ‘/sbin/init’, which can be a symbolic link to the init daemon
=> file /sbin/init
=> /sbin/init: symbolic link to `../usr/lib/systemd/systemd'
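The same two checks can be done from a script. This is a minimal Python sketch, assuming a Linux system with ‘/proc’ mounted; `describe_init` is a name I made up, and the default paths are the ones used in the commands above.

```python
import os


def describe_init(comm_path="/proc/1/comm", init_path="/sbin/init"):
    """Return (name of PID 1, fully resolved path of the init binary)."""
    with open(comm_path) as fh:
        name = fh.read().strip()
    # realpath follows every symlink in the chain until it hits a real file
    return name, os.path.realpath(init_path)


if __name__ == "__main__":
    if os.path.exists("/proc/1/comm"):
        name, path = describe_init()
        print(f"PID 1 is {name}; /sbin/init resolves to {path}")
```

The output depends on the distribution: on a systemd machine the first value would be ‘systemd’, on others it may simply be ‘init’.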
Some tasks commonly performed by init scripts are
- setting the name of the computer
- setting the time zone
- checking disks with ‘fsck’
- mounting the system’s disks
- removing old files from ‘/tmp’
- configuring network interfaces
- starting up daemons and network services
init has a concept of run levels. Different init systems may use different terms for run levels (e.g. systemd calls ‘em targets), but the purpose is similar in most cases.
Traditional init defines at least seven run levels, each of which represents a particular set of services that the system should be running. Exact definitions vary among systems, but the general points are
- at level 0, system is completely shut down
- level 1 represents single user mode
- levels 2 to 5 are multi-user modes and include support for networking
- level 6 is ‘reboot’
The number of levels varies between systems, as do their names. The system is now booted, and proceeds depending on the run level it’s been set for.
You can get more information about the run levels here. An excerpt:

0 - Halt the system.
1 - Single-user mode (for special administration).
2 - Local Multiuser with Networking but without network service (like NFS)
3 - Full Multiuser with Networking
4 - Not Used
5 - Full Multiuser with Networking and X Windows(GUI)
6 - Reboot.
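Since systemd talks about targets rather than run levels, here is a rough Python sketch of how the numbers above usually map to systemd targets. The mapping follows the compatibility aliases systemd ships (runlevel0.target to runlevel6.target); distributions are free to change them, so treat the table as an assumption rather than a guarantee. `target_for_runlevel` is a hypothetical helper.

```python
# Usual SysV run level -> systemd target mapping (distributions may differ).
RUNLEVEL_TARGETS = {
    0: "poweroff.target",    # halt the system
    1: "rescue.target",      # single-user mode
    2: "multi-user.target",
    3: "multi-user.target",
    4: "multi-user.target",
    5: "graphical.target",   # multi-user with GUI
    6: "reboot.target",
}


def target_for_runlevel(level: int) -> str:
    """Look up the systemd target that usually stands in for a run level."""
    if level not in RUNLEVEL_TARGETS:
        raise ValueError(f"unknown run level: {level}")
    return RUNLEVEL_TARGETS[level]


if __name__ == "__main__":
    for level in sorted(RUNLEVEL_TARGETS):
        print(level, "->", target_for_runlevel(level))
```

Note that systemd folds run levels 2, 3 and 4 into a single multi-user target, so the finer distinctions in the excerpt above do not carry over one to one.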