Some of the content might be incorrect, since I am still trying to understand it thoroughly.

The x86 architecture provides three main operating modes:

  • Real mode

    • 16-bit mode;

    • No privilege levels (all code runs fully privileged);

    • No virtual memory (memory segmentation only);

      • 20-bit address space;

      • Linear address = Physical address;

      • Each segment is 64 KiB in size (16-bit offset);

  • Protected mode

    • 32-bit mode;

    • Segmentation and Virtual memory;

      • 32-bit/36-bit physical address space;

  • Long mode

    • 64-bit mode;

    • Virtual memory only;

    • Compatibility mode, so we can execute 32-bit code;

      • 48-bit virtual address space
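
To get a feel for what these address widths mean in bytes, the sketch below converts each width from the list above into a size. The `ADDRESS_BITS` table and the `human_size` helper are hypothetical names made up for this illustration.

```python
# Address widths (in bits) taken from the mode list above.
ADDRESS_BITS = {
    "real mode": 20,
    "protected mode": 32,
    "protected mode (36-bit)": 36,
    "long mode": 48,
}

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def human_size(num_bytes):
    """Format a power-of-two byte count using binary units."""
    unit = 0
    while num_bytes >= 1024 and unit < len(UNITS) - 1:
        num_bytes //= 1024
        unit += 1
    return f"{num_bytes} {UNITS[unit]}"


for mode, bits in ADDRESS_BITS.items():
    print(f"{mode}: 2^{bits} = {human_size(2 ** bits)}")
```

Running it shows the jump from 1 MiB in real mode to 4 GiB (or 64 GiB) in protected mode and 256 TiB for a 48-bit address.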

Real mode and powering the machine

Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. Real mode is characterized by a 20-bit segmented memory address space (giving exactly 1 MiB of addressable memory) and unlimited direct software access to all addressable memory, I/O addresses and peripheral hardware. Real mode provides no support for memory protection, multitasking, or code privilege levels. Before the release of the 80286, which introduced protected mode, real mode was the only available mode for x86 CPUs. In the interest of backward compatibility, all x86 CPUs start in real mode when reset, though it is possible to emulate real mode on other systems when starting on other modes.

— Wikipedia Real mode - https://en.wikipedia.org/wiki/Real_mode

When your computer boots, the CPU clears any leftover data in its registers and loads them with predefined values. These predefined (by design) values determine where the CPU looks for the first instruction to execute.

The Intel 80386 Programmer’s Reference Manual, written in 1986, specifies the predefined values:

  • EFLAGS = 00000002H

  • IP = 0000FFF0H

  • CS = 000H

  • DS = 0000H

  • ES = 0000H

  • SS = 0000H

  • FS = 0000H

  • GS = 0000H

We use Intel's 80386/8086 as a reference, since later CPUs initialize these registers the same way.

The registers we are interested in are the following:
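
As a rough illustration of how reset values lead to the first instruction, the sketch below adds the instruction pointer to the code segment base. Note the assumptions: the 8086 reset values (CS = 0xFFFF, IP = 0x0000) and the 80386's hidden CS base of 0xFFFF0000 come from Intel's documentation as I understand it, not from the list above, and `first_fetch_address` is a hypothetical helper.

```python
def first_fetch_address(cs_base, ip):
    """Address of the first instruction: code segment base plus IP."""
    return cs_base + ip


# 8086 (assumed reset values): CS = 0xFFFF, IP = 0x0000.
# In real mode the segment base is the segment value shifted left by 4.
addr_8086 = first_fetch_address(0xFFFF << 4, 0x0000)
print(hex(addr_8086))  # 0xffff0

# 80386 (assumed): IP = 0xFFF0, and the hidden CS base starts out as
# 0xFFFF0000 until the first far jump reloads CS.
addr_80386 = first_fetch_address(0xFFFF0000, 0xFFF0)
print(hex(addr_80386))  # 0xfffffff0
```

In both cases the first instruction sits 16 bytes below the top of the address space the CPU can reach at reset.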

  • CS: Code segment register

    • Points to the current active code segment;

  • DS: Data segment register

    • Points to the default data segment (global and static variables of the running program);

  • ES: Extra segment register

    • General purpose segment register (mostly for data transfer);

  • SS: Stack segment register

    • Points to the segment containing the active stack;

  • FS/GS: General-purpose segment registers

    • Used for any purpose in your application;

A better overview of these registers:

Real mode segmented model

From C-Jump

After all these initializations have happened, the computer starts working in real mode. Hang on, what about the 20-bit (2^20 = 1 MiB) segmented address space? The CPU only has 16-bit registers, so a single register can hold addresses only up to 2^16 - 1, which covers 64 KiB. For that reason, we use memory segmentation to reach the whole address space. Memory segmentation means that memory is accessed through fixed-size segments of 64 KiB each. An address has two parts: a segment value, which determines the segment's base address, and an offset from that base. An offset therefore needs to be paired with a segment to locate data in memory (segment:offset forms a 20-bit address out of two 16-bit values).
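
To picture how these segments sit in memory, here is a minimal sketch that prints the range of physical addresses each segment can reach. It assumes each segment's base is its 16-bit value multiplied by 16 (the rule the following text spells out); `segment_window` and `SEGMENT_SIZE` are hypothetical names for this illustration.

```python
SEGMENT_SIZE = 0x10000  # 64 KiB, the range of a 16-bit offset


def segment_window(segment):
    """Return the (first, last) physical address reachable via a segment."""
    base = segment * 16
    return base, base + SEGMENT_SIZE - 1


for segment in (0x0000, 0x0001, 0x1000):
    first, last = segment_window(segment)
    print(f"segment {segment:#06x}: {first:#07x} .. {last:#07x}")
```

Under that assumption, neighbouring segment values start only 16 bytes apart, so the 64 KiB segments overlap heavily rather than tiling memory end to end.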

Segments and offsets are related to physical addresses by the following equation: physical address = segment * 16 + offset. For example, hex((0x2000 * 16) + 0x0010) gives 0x20010. In assembly it is easier to use bit shifting, since multiplying by 16 is the same as shifting left by 4 bits: (0x2000 << 4) + 0x0010.

To make this concrete, if the CS register currently holds 0x0100 and IP holds 0x0200, we can find the physical address with: P = (0x0100 << 4) + 0x0200 = 0x1200.

The above statement could be incorrect.
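
One way to sanity-check the calculations above is simply to run the arithmetic. This is a minimal sketch; `physical_address` is a hypothetical helper name, and it only models the segment * 16 + offset rule, nothing else about the CPU.

```python
def physical_address(segment, offset):
    """Real-mode translation: shift the segment left by 4 bits, add the offset."""
    return (segment << 4) + offset


# Shifting left by 4 is the same as multiplying by 16.
assert (0x2000 << 4) == 0x2000 * 16

print(hex(physical_address(0x2000, 0x0010)))  # 0x20010
print(hex(physical_address(0x0100, 0x0200)))  # 0x1200

# Different segment:offset pairs can name the same physical byte.
print(physical_address(0x1000, 0x0010) == physical_address(0x1001, 0x0000))  # True
```

The last line shows a side effect of the scheme: a physical address does not have a unique segment:offset spelling.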

Let’s try to visualize what this 20-bit segmented memory address space looks like:

20-bit segmented memory address

From Kurtqia

The IP register is 16 bits wide, so on its own it can address only 64 KiB of code. To reach beyond that, the CPU pairs it with the CS register. Note that CS:IP is not a flat 32-bit address: the two values are combined as (CS << 4) + IP, so together they reach roughly 1 MiB in real mode, not 4294967296 bytes (4 GiB).
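
A quick way to check how far a CS:IP pair can actually reach is to plug the largest 16-bit values into the segment * 16 + offset rule from earlier. This is only a sketch: `physical_address` is a hypothetical helper, and the 20-bit wrap-around mask assumes a CPU with just 20 address lines, such as the 8086.

```python
def physical_address(segment, offset):
    """Real-mode translation: segment * 16 + offset."""
    return (segment << 4) + offset


highest = physical_address(0xFFFF, 0xFFFF)
print(hex(highest))          # 0x10ffef
print(highest.bit_length())  # 21

# Far below what a flat 32-bit address could cover.
print(highest < 2 ** 32)     # True

# With only 20 address lines (assumed here), bit 20 is dropped
# and the address wraps around to low memory.
wrapped = highest & 0xFFFFF
print(hex(wrapped))          # 0xffef
```

So the highest address CS:IP can express is just under 1 MiB + 64 KiB, which is why real mode is described as having a 20-bit address space.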

To read more about this, check Part 1 - Kernel booting process by Linux Inside