Technical Brief 0002: x86-64 Bootstrap Subsystem
================================================

System Requirements and Constraints
-----------------------------------

The design of a clean-slate, C++23-based operating system kernel necessitates a low-level bootstrap subsystem.
This subsystem manages the transition from the machine's power-on state to a controlled 64-bit execution environment.
It must operate under several architectural and toolchain constraints:

1.  **Bootloader Conformance:** The kernel is loaded by a bootloader adhering to the Multiboot2 Specification.
    This conformance establishes a critical contract between the bootloader and the kernel.
    The bootstrap code must therefore correctly identify the Multiboot2 magic number (``0x36d76289``) passed in the ``%eax`` register.
    It must also interpret the pointer to the boot information structure passed in ``%ebx`` [1]_.
    Adhering to this standard decouples the kernel from any specific bootloader implementation, ensuring portability across compliant environments like GRUB 2.

2.  **CPU Mode Transition:** The CPU is assumed to be in 32-bit protected mode upon entry to the bootstrap code.
    The subsystem is responsible for every step required to enable 64-bit long mode.
    The sequence involves enabling Physical Address Extension (PAE) via the ``%cr4`` control register, loading ``%cr3`` with the physical address of the top-level page table, setting the Long Mode Enable (LME) bit in the Extended Feature Enable Register (EFER) MSR (``0xC0000080``), and finally enabling paging via the ``%cr0`` control register.

3.  **Position-Independent Executable (PIE):** The kernel is compiled and linked as a PIE to allow it to be loaded at an arbitrary physical address.
    This imposes a strict constraint on the 32-bit assembly code: it must not contain any absolute address relocations.
    While a C++ compiler can generate position-independent code automatically, in hand-written assembly this requires the manual calculation of all symbol addresses at runtime.
    This is a significant departure from simpler, absolute-addressed code.
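
The register updates named in the second constraint can be modeled in plain C++.
The sketch below is illustrative only: the bit positions and the MSR index are architectural, but ``cpu_state``, ``enable_long_mode`` and ``long_mode_prerequisites_met`` are hypothetical names that do not come from the kernel sources, and the real bootstrap performs these steps in assembly.

.. code-block:: cpp

   #include <cstdint>

   // Architectural constants (bit positions as defined for x86-64).
   constexpr std::uint32_t cr0_pe   = 1u << 0;      // CR0.PE: protected mode enabled
   constexpr std::uint32_t cr0_pg   = 1u << 31;     // CR0.PG: paging enabled
   constexpr std::uint32_t cr4_pae  = 1u << 5;      // CR4.PAE: physical address extension
   constexpr std::uint32_t efer_msr = 0xC0000080u;  // index of the EFER MSR
   constexpr std::uint64_t efer_lme = 1ull << 8;    // EFER.LME: long mode enable

   // Hypothetical snapshot of the registers touched during the transition.
   struct cpu_state {
     std::uint32_t cr0;
     std::uint32_t cr4;
     std::uint64_t efer;
   };

   // Applies the updates in the order the brief lists them.
   // Loading %cr3 with the page table root is not modeled here.
   constexpr cpu_state enable_long_mode(cpu_state state) {
     state.cr4 |= cr4_pae;    // 1. enable PAE
     state.efer |= efer_lme;  // 2. set LME in EFER
     state.cr0 |= cr0_pg;     // 3. enable paging last
     return state;
   }

   // True once every bit required for long mode is set.
   constexpr bool long_mode_prerequisites_met(cpu_state state) {
     return (state.cr0 & cr0_pe) != 0 && (state.cr0 & cr0_pg) != 0 &&
            (state.cr4 & cr4_pae) != 0 && (state.efer & efer_lme) != 0;
   }

   // A CPU that starts in protected mode satisfies the check after the sequence.
   static_assert(long_mode_prerequisites_met(enable_long_mode(cpu_state{cr0_pe, 0, 0})),
                 "sequence sets every required bit");

Paging is enabled last because the hardware activates long mode only when ``CR0.PG`` is set while ``EFER.LME`` and ``CR4.PAE`` are already in place.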

Architectural Overview
----------------------

The bootstrap architecture is partitioned into three distinct components.
This enforces a modular and verifiable transition sequence.
The components are: a shared C++/assembly interface (``boot.hpp``), a 32-bit PIE transition stage (``boot32.S``), and a minimal 64-bit entry stage (``entry64.s``).
This separation is a deliberate design choice to manage complexity.
It ensures that mode-specific logic is isolated, preventing subtle bugs that could arise from mixing 32-bit and 64-bit concerns.
Furthermore, it makes the state transition between each stage explicit and auditable.
This is critical for both debugging and for the educational utility of the codebase.

Component Analysis
------------------

C++/Assembly Interface (``boot.hpp``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A single header file serves as the definitive interface between assembly code and C++.
It relies on the ``__ASSEMBLER__`` preprocessor macro, which the GNU toolchain predefines when it preprocesses assembly sources.
Testing this macro lets one file serve both languages.

* **Shared Constants:** The header defines all magic numbers (e.g., ``MULTIBOOT2_MAGIC``), GDT flags, and other constants required by both the assembly and C++ code.
  This ensures a single source of truth, eliminating the risk of inconsistencies that could arise from maintaining parallel definitions in different language domains.

* **Conditional Declarations:** C++-specific declarations, such as ``extern "C"`` variable declarations using the ``teachos::arch::asm_pointer`` wrapper, are confined within an ``#ifndef __ASSEMBLER__`` block.
  This prevents the assembler from attempting to parse C++ syntax, which would abort assembly with an error, while making the full, type-safe interface available to the C++ compiler.
  The ``asm_pointer`` class is particularly important.
  It encapsulates a raw address and prevents its unsafe use as a standard pointer within C++, forcing any interaction to be explicit and controlled.
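
A minimal sketch of such a dual-purpose header follows.
It is not the contents of ``boot.hpp``: the ``sketch`` namespace and the members of this ``asm_pointer`` stand-in are assumptions made for illustration, since the brief does not show the real class.

.. code-block:: cpp

   // Part 1: visible to both the assembler and the C++ compiler.
   #define MULTIBOOT2_MAGIC 0x36d76289

   // Part 2: hidden from the assembler, which cannot parse C++.
   #ifndef __ASSEMBLER__

   #include <cstdint>
   #include <type_traits>

   namespace sketch {

   // Hypothetical stand-in for teachos::arch::asm_pointer.
   // It stores a raw address but offers no dereference operators and no
   // conversion to T*, so every use of the address has to be explicit.
   template <typename T>
   class asm_pointer {
   public:
     constexpr explicit asm_pointer(std::uintptr_t address) : address_{address} {}

     // The only way to get at the stored value.
     constexpr std::uintptr_t address() const { return address_; }

   private:
     std::uintptr_t address_;
   };

   // The wrapper cannot silently decay into an ordinary pointer.
   static_assert(!std::is_convertible<asm_pointer<int>, int*>::value,
                 "no implicit conversion to a raw pointer");

   }  // namespace sketch

   #endif  // __ASSEMBLER__

When the same file is included from a preprocessed assembly source, only the ``#define`` survives, so both languages read the magic number from one definition.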

32-bit Transition Stage (``boot32.S``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This file contains all code and data necessary to prepare the system for long mode.
Its logic is fundamentally incompatible with the 64-bit environment due to differences in stack width, calling conventions, and instruction encoding.

* **Position-Independent Execution (PIE):** The 32-bit x86 ISA lacks a native instruction-pointer-relative addressing mode.
  To satisfy the PIE constraint, all symbol addresses are calculated at runtime.
  This is achieved via the ``call/pop`` idiom to retrieve the value of the instruction pointer (``%eip``) into a base register (``%esi``).
  All subsequent memory accesses are then performed by calculating a link-time constant offset from this runtime base (e.g., ``leal (symbol - .Lbase)(%esi), %eax``).
  This manual implementation of position independence is critical to avoid linker errors related to absolute relocations (``R_X86_64_32``) in a PIE binary.

* **System State Verification:** The first actions are a series of assertions.
  The code first verifies the Multiboot2 magic number (``0x36d76289``) passed in ``%eax`` [1]_.
  It then uses the ``CPUID`` instruction to verify that the processor supports long mode.
  This is done by checking for the LM bit (bit 29) in ``%edx`` after executing ``CPUID`` with ``0x80000001`` in ``%eax`` [2]_.
  Failure of any assertion results in a call to a panic routine that halts the system.
  This "fail-fast" approach is crucial; proceeding in an unsupported environment would lead to unpredictable and difficult-to-debug faults deep within the kernel.

* **Formal Transition via ``lret``:** The stage concludes with an ``lret`` (far return) instruction.
  Changing the CPU's execution mode requires loading a new code segment selector, which only an inter-segment (far) control transfer can do.
  A near ``jmp`` is insufficient because it leaves ``%cs`` unchanged.
  The choice of ``lret`` over other far-control transfer instructions like ``ljmp`` or ``lcall`` is a direct consequence of the PIE constraint.
  The direct forms of ``ljmp`` and ``lcall`` require their target address to be a link-time constant.
  This would embed an absolute address into the executable and violate the principles of position independence.
  In contrast, ``lret`` consumes its target selector and offset from the stack.
  This mechanism is perfectly suited for a PIE environment.
  It allows for a dynamically calculated, position-independent address to be pushed onto the stack immediately before the instruction is executed.
  Furthermore, ``lcall`` is architecturally inappropriate.
  It would push a return far pointer onto the stack for a transition that is strictly one-way, leaving a stale 32-bit frame behind for the 64-bit stage.
  ``lret`` models this one-way transfer directly and is therefore the cleanest option.
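
Two of the points above reduce to simple arithmetic and bit tests, which the following C++ model makes concrete.
It is a sketch under assumptions: the function names are hypothetical, the addresses in the usage note are invented, and the real stage performs these steps in 32-bit assembly.

.. code-block:: cpp

   #include <cstdint>

   constexpr std::uint32_t multiboot2_magic    = 0x36d76289u;  // expected in %eax at entry
   constexpr std::uint32_t cpuid_extended_leaf = 0x80000001u;  // CPUID leaf queried for LM
   constexpr std::uint32_t cpuid_edx_long_mode = 1u << 29;     // LM bit in %edx

   // Models `leal (symbol - .Lbase)(%esi), %eax`.
   // (symbol_link - base_link) is a link-time constant offset; base_runtime is
   // the value the call/pop idiom leaves in %esi.
   constexpr std::uint32_t runtime_address(std::uint32_t base_runtime,
                                           std::uint32_t symbol_link,
                                           std::uint32_t base_link) {
     return base_runtime + (symbol_link - base_link);
   }

   // Models the two assertions: correct bootloader magic and long mode support.
   constexpr bool environment_supported(std::uint32_t eax_at_entry,
                                        std::uint32_t cpuid_edx) {
     return eax_at_entry == multiboot2_magic &&
            (cpuid_edx & cpuid_edx_long_mode) != 0;
   }

For example, if the image is linked at ``0x00101000`` but loaded at ``0x00201000``, ``runtime_address(0x00201005, 0x00101240, 0x00101005)`` yields ``0x00201240``: the symbol moves by the same distance as the base, and no absolute relocation is needed.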

64-bit Entry Stage (``entry64.s``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This file provides a minimal, clean entry point into the 64-bit world.
It ensures the C++ kernel begins execution in a pristine environment.

* **Final State Setup:** Its sole responsibilities are to load the data segment registers (``%ss``, ``%ds``, etc.) with the correct selector from the new GDT and then to transfer control to the C++ kernel's ``main`` function via a standard ``call``.
  Setting the segment registers is the first action performed, so that no later access, including the stack push performed by the subsequent ``call``, runs with a selector left over from the 32-bit stage.
  A stale or invalid selector can otherwise surface as a general protection fault.

* **Halt State:** Should ``main`` ever return—an event that signifies a critical kernel failure—execution falls through to an infinite ``hlt`` loop.
  This is a crucial fail-safe.
  It prevents the CPU from executing beyond the end of the kernel's code, which would lead to unpredictable behavior as the CPU attempts to interpret non-executable data as instructions.

Key Implementation Decisions
----------------------------

``lret`` Stack Frame Construction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The transition to 64-bit mode is initiated by executing an ``lret`` instruction from 32-bit code.
The behavior of this instruction is determined by the characteristics of the destination code segment descriptor referenced by the selector on the stack.

The stack is prepared as follows:

1.  ``leal (_entry64 - .Lbase)(%esi), %eax``: The PIE-compatible virtual address of the 64-bit entry point is calculated and placed in ``%eax``.

2.  ``pushl $global_descriptor_table_code``: The 16-bit selector for the 64-bit code segment is pushed onto the stack as a 32-bit value.

3.  ``pushl %eax``: The 32-bit address of the entry point is pushed onto the stack.

When ``lret`` is executed with a 32-bit operand size, it pops a 32-bit instruction pointer and then a 32-bit stack slot whose low 16 bits hold the code selector [3]_.
The processor then examines the GDT descriptor referenced by the new code selector.
Because this descriptor has its L bit (64-bit code segment) set to 1, the processor switches from compatibility mode, which became active when paging was enabled with LME set, to 64-bit mode.
It then begins executing at the popped offset, zero-extended to a 64-bit address [2]_.
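
The push order matters because ``lret`` pops the offset before the selector.
The following C++ model of a downward-growing 32-bit stack illustrates this; ``stack32``, ``far_target`` and ``simulate_lret`` are hypothetical names, and the selector value ``0x08`` used in the usage note assumes the code descriptor sits at GDT index 1.

.. code-block:: cpp

   #include <cstddef>
   #include <cstdint>

   // Destination of a far transfer: code selector plus offset.
   struct far_target {
     std::uint16_t selector;
     std::uint32_t offset;
   };

   // Tiny model of a 32-bit stack that grows towards lower indices.
   class stack32 {
   public:
     constexpr void push(std::uint32_t value) { slots_[--top_] = value; }
     constexpr std::uint32_t pop() { return slots_[top_++]; }

   private:
     std::uint32_t slots_[8] = {};
     std::size_t top_ = 8;
   };

   // Builds the frame as in steps 2 and 3, then consumes it the way a
   // 32-bit lret does: offset first, selector second.
   constexpr far_target simulate_lret(std::uint16_t code_selector,
                                      std::uint32_t entry_offset) {
     stack32 stack{};
     stack.push(code_selector);  // pushl $selector: widened to a 32-bit slot
     stack.push(entry_offset);   // pushl %eax: runtime address of the entry point
     const std::uint32_t eip = stack.pop();
     const std::uint16_t cs = static_cast<std::uint16_t>(stack.pop());  // low 16 bits
     return far_target{cs, eip};
   }

With these assumptions, ``simulate_lret(0x08, 0x00201240)`` returns selector ``0x08`` and offset ``0x00201240``; swapping the two pushes would make the processor treat the entry address as a selector.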

Memory Virtualization and GDT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A four-level page table hierarchy (PML4) is constructed to enable paging, a prerequisite for long mode.
An initial identity map of 32 MiB of physical memory is created using 2 MiB huge pages.
At this granularity the mapping needs only 16 page-directory entries and no level-1 page tables, which keeps the bootstrap tables small.
A recursive mapping in the PML4 at a conventional index (511) is also established.
This powerful technique allows the C++ kernel's memory manager to access and modify the entire page table hierarchy as if it were a linear array at a single, well-known virtual address.
This greatly simplifies the logic required for virtual memory operations.
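
The numbers behind this layout can be checked with a short C++ sketch.
The entry flags chosen here (present, writable, page size) and all identifiers are assumptions for illustration; the brief does not list the flags the bootstrap code actually sets.

.. code-block:: cpp

   #include <cstddef>
   #include <cstdint>

   constexpr std::uint64_t mib             = 1024ull * 1024ull;
   constexpr std::uint64_t huge_page_size  = 2 * mib;
   constexpr std::uint64_t identity_mapped = 32 * mib;
   constexpr std::size_t huge_page_count   = identity_mapped / huge_page_size;  // 16

   // Architectural flag bits of a paging-structure entry.
   constexpr std::uint64_t entry_present   = 1ull << 0;
   constexpr std::uint64_t entry_writable  = 1ull << 1;
   constexpr std::uint64_t entry_page_size = 1ull << 7;  // in a PD entry: maps a 2 MiB page

   // Page-directory entry that identity-maps the index-th 2 MiB region.
   constexpr std::uint64_t huge_page_entry(std::size_t index) {
     return (index * huge_page_size) | entry_present | entry_writable | entry_page_size;
   }

   constexpr std::uint64_t recursive_index = 511;

   // Canonical virtual address selected by four 9-bit table indices.
   constexpr std::uint64_t virtual_address(std::uint64_t pml4, std::uint64_t pdpt,
                                           std::uint64_t pd, std::uint64_t pt) {
     const std::uint64_t low = (pml4 << 39) | (pdpt << 30) | (pd << 21) | (pt << 12);
     // Bit 47 is sign-extended into bits 48..63.
     return (pml4 & 0x100) != 0 ? (low | 0xFFFF000000000000ull) : low;
   }

   // Following the recursive slot four times lands on the PML4 itself.
   constexpr std::uint64_t pml4_self_address =
       virtual_address(recursive_index, recursive_index, recursive_index, recursive_index);

   // Following it once exposes every level-1 page table as one contiguous window.
   constexpr std::uint64_t page_table_window = virtual_address(recursive_index, 0, 0, 0);

With index 511 the PML4 appears at ``0xFFFFFFFFFFFFF000`` and the page-table window starts at ``0xFFFFFF8000000000``, which are the well-known addresses a memory manager can rely on.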

A new GDT is defined containing the necessary null, 64-bit code, and 64-bit data descriptors.
The first entry in the GDT must be a null descriptor, as the processor architecture reserves selector value 0 as a special "null selector."
Loading a data segment register (``%ds``, ``%es``, ``%fs``, or ``%gs``) with this null selector is valid.
However, in protected mode any subsequent memory access through such a register generates a general-protection exception, and loading the null selector into ``%cs`` or ``%ss`` faults immediately.
This provides a fail-safe mechanism against the use of uninitialized segment selectors [2]_.
The selector for the data descriptor is exported as a global symbol (``global_descriptor_table_data``).
This design prioritizes explicitness and debuggability: the dependency between the two stages is clearly visible in the source code.
The alternative, passing the selector value in a register, would create an implicit, less obvious contract between the stages that could complicate future maintenance.
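
A three-entry GDT of this kind can be sketched in C++ as follows.
The bit positions are architectural, but the table order (null, code, data), the exact flag combination and every identifier are assumptions; the real table is defined in assembly and may differ.

.. code-block:: cpp

   #include <cstdint>

   constexpr std::uint64_t bit(unsigned position) { return 1ull << position; }

   // Architectural bits of a 64-bit segment descriptor.
   constexpr std::uint64_t seg_read_write = bit(41);  // readable code / writable data
   constexpr std::uint64_t seg_executable = bit(43);  // code segment
   constexpr std::uint64_t seg_code_data  = bit(44);  // S: not a system descriptor
   constexpr std::uint64_t seg_present    = bit(47);  // P
   constexpr std::uint64_t seg_long_mode  = bit(53);  // L: 64-bit code segment

   // Hypothetical table layout: null, 64-bit code, 64-bit data.
   constexpr std::uint64_t gdt_sketch[3] = {
       0,
       seg_present | seg_code_data | seg_executable | seg_read_write | seg_long_mode,
       seg_present | seg_code_data | seg_read_write,
   };

   // A selector is the table index shifted past the TI and RPL bits.
   constexpr std::uint16_t selector(unsigned index, unsigned rpl = 0) {
     return static_cast<std::uint16_t>((index << 3) | rpl);
   }

   constexpr std::uint16_t code_selector = selector(1);  // 0x08
   constexpr std::uint16_t data_selector = selector(2);  // 0x10

Under this layout the exported data selector would have the value ``0x10``, and selector ``0`` always refers to the null descriptor.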

References
----------

.. [1] Free Software Foundation, "The Multiboot2 Specification, version 2.0," Free Software Foundation, Inc., 2016. `Online <https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html>`_.

.. [2] Intel Corporation, *Intel® 64 and IA-32 Architectures Software Developer’s Manual, Combined Volumes 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4*, Order No. 325462-081US, July 2025. `Online <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html>`_.

.. [3] AMD, Inc., *AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions*, Publication No. 24594, Rev. 3.42, June 2025. `Online <https://www.amd.com/en/support/tech-docs/amd64-architecture-programmers-manual-volumes-1-5>`_.