Diffstat (limited to 'docs')
| -rw-r--r-- | docs/briefs/tb0002-x86_64_bootstrap.rst | 154 |
| -rw-r--r-- | docs/requirements.txt | 1 |
2 files changed, 154 insertions, 1 deletions
diff --git a/docs/briefs/tb0002-x86_64_bootstrap.rst b/docs/briefs/tb0002-x86_64_bootstrap.rst
new file mode 100644
index 0000000..b7a6c2a
--- /dev/null
+++ b/docs/briefs/tb0002-x86_64_bootstrap.rst
@@ -0,0 +1,154 @@
+Technical Brief 0002: x86-64 Bootstrap Subsystem
+================================================
+
+System Requirements and Constraints
+-----------------------------------
+
+The design of a clean-slate, C++23-based operating system kernel necessitates a low-level bootstrap subsystem.
+This subsystem manages the transition from the machine's power-on state to a controlled 64-bit execution environment.
+It must operate under several architectural and toolchain constraints:
+
+1. **Bootloader Conformance:** The kernel is loaded by a bootloader adhering to the Multiboot2 Specification.
+   This conformance establishes a critical contract between the bootloader and the kernel.
+   The bootstrap code must therefore correctly identify the Multiboot2 magic number (``0x36d76289``) passed in the ``%eax`` register.
+   It must also interpret the pointer to the boot information structure passed in ``%ebx`` [1]_.
+   Adhering to this standard decouples the kernel from any specific bootloader implementation, ensuring portability across compliant environments like GRUB 2.
+
+2. **CPU Mode Transition:** The CPU is assumed to be in 32-bit protected mode upon entry to the bootstrap code.
+   The subsystem is responsible for all requisite steps to enable 64-bit long mode.
+   This is a non-trivial process.
+   It involves enabling Physical Address Extension (PAE) via the ``%cr4`` control register, setting the Long Mode Enable (LME) bit in the Extended Feature Enable Register (EFER) MSR (``0xC0000080``), and finally enabling paging via the ``%cr0`` control register.
+
+3. **Position-Independent Executable (PIE):** The kernel is compiled and linked as a PIE to allow it to be loaded at an arbitrary physical address.
+   This imposes a strict constraint on the 32-bit assembly code: it must not contain any absolute address relocations.
+   While a C++ compiler can generate position-independent code automatically, in hand-written assembly this requires the manual calculation of all symbol addresses at runtime.
+   This is a significant departure from simpler, absolute-addressed code.
+
+Architectural Overview
+----------------------
+
+The bootstrap architecture is partitioned into three distinct components.
+This enforces a modular and verifiable transition sequence.
+The components are: a shared C++/assembly interface (``boot.hpp``), a 32-bit PIE transition stage (``boot32.S``), and a minimal 64-bit entry stage (``entry64.s``).
+This separation is a deliberate design choice to manage complexity.
+It ensures that mode-specific logic is isolated, preventing subtle bugs that could arise from mixing 32-bit and 64-bit concerns.
+Furthermore, it makes the state transition between each stage explicit and auditable.
+This is critical for both debugging and for the educational utility of the codebase.
+
+Component Analysis
+------------------
+
+C++/Assembly Interface (``boot.hpp``)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A single header file serves as the definitive interface between assembly code and C++.
+This is achieved through the use of the ``__ASSEMBLER__`` preprocessor macro.
+This is a standard feature of the GNU toolchain that allows a single file to serve a dual purpose.
+
+* **Shared Constants:** The header defines all magic numbers (e.g., ``MULTIBOOT2_MAGIC``), GDT flags, and other constants required by both the assembly and C++ code.
+  This ensures a single source of truth, eliminating the risk of inconsistencies that could arise from maintaining parallel definitions in different language domains.
+
+* **Conditional Declarations:** C++-specific declarations, such as ``extern "C"`` variable declarations using the ``teachos::arch::asm_pointer`` wrapper, are confined within an ``#ifndef __ASSEMBLER__`` block.
+  This prevents the assembler from attempting to parse C++ syntax, which would result in an assembler error, while making the full, type-safe interface available to the C++ compiler.
+  The ``asm_pointer`` class is particularly important.
+  It encapsulates a raw address and prevents its unsafe use as a standard pointer within C++, forcing any interaction to be explicit and controlled.
+
+32-bit Transition Stage (``boot32.S``)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This file contains all code and data necessary to prepare the system for long mode.
+Its logic is fundamentally incompatible with the 64-bit environment due to differences in stack width, calling conventions, and instruction encoding.
+
+* **Position-Independent Execution (PIE):** The 32-bit x86 ISA lacks a native instruction-pointer-relative addressing mode.
+  To satisfy the PIE constraint, all symbol addresses are calculated at runtime.
+  This is achieved via the ``call/pop`` idiom to retrieve the value of the instruction pointer (``%eip``) into a base register (``%esi``).
+  All subsequent memory accesses are then performed by calculating a link-time constant offset from this runtime base (e.g., ``leal (symbol - .Lbase)(%esi), %eax``).
+  This manual implementation of position independence is critical to avoid linker errors related to absolute relocations (``R_X86_64_32``) in a PIE binary.
+
+* **System State Verification:** The first actions are a series of assertions.
+  The code first verifies the Multiboot2 magic number (``0x36d76289``) passed in ``%eax`` [1]_.
+  It then uses the ``CPUID`` instruction to verify that the processor supports long mode.
+  This is done by checking for the LM bit (bit 29) in ``%edx`` after executing ``CPUID`` with ``0x80000001`` in ``%eax`` [2]_.
+  Failure of any assertion results in a call to a panic routine that halts the system.
+  This "fail-fast" approach is crucial; proceeding in an unsupported environment would lead to unpredictable and difficult-to-debug faults deep within the kernel.
+
+* **Formal Transition via ``lret``:** The stage concludes with an ``lret`` (far return) instruction.
+  A far (inter-segment) control transfer is architecturally required to load a new code segment selector and thereby change the CPU's execution mode.
+  A near ``jmp`` is insufficient, as it does not reload ``%cs`` and therefore cannot change the execution mode.
+  The choice of ``lret`` over other far-control transfer instructions like ``ljmp`` or ``lcall`` is a direct consequence of the PIE constraint.
+  The direct forms of ``ljmp`` and ``lcall`` require their target address to be a link-time constant.
+  This would embed an absolute address into the executable and violate the principles of position independence.
+  In contrast, ``lret`` consumes its target selector and offset from the stack.
+  This mechanism is well suited to a PIE environment.
+  It allows for a dynamically calculated, position-independent address to be pushed onto the stack immediately before the instruction is executed.
+  Furthermore, ``lcall`` is architecturally inappropriate.
+  It would push a 32-bit return address onto the stack before the mode switch, corrupting the 64-bit stack frame for a transition that should be strictly one-way.
+  ``lret`` correctly models this one-way transfer and is therefore the cleanest of the available options.
+
+64-bit Entry Stage (``entry64.s``)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This file provides a minimal, clean entry point into the 64-bit world.
+It ensures the C++ kernel begins execution in a pristine environment.
+
+* **Final State Setup:** Its sole responsibilities are to initialize the 64-bit data segment registers (``%ss``, ``%ds``, etc.) with the correct selector from the new GDT.
+  It then transfers control to the C++ kernel's ``main`` function via a standard ``call``.
+  Setting the segment registers is the first action performed, so that every later operation that depends on them, including the stack operations performed by the subsequent ``call``, sees selectors that are valid in the new GDT.
+  Leaving stale selectors from the 32-bit stage in place risks a general-protection fault.
+
+* **Halt State:** Should ``main`` ever return, an event that signifies a critical kernel failure, execution falls through to an infinite ``hlt`` loop.
+  This is a crucial fail-safe.
+  It prevents the CPU from executing beyond the end of the kernel's code, which would lead to unpredictable behavior as the CPU attempts to interpret non-executable data as instructions.
+
+Key Implementation Decisions
+----------------------------
+
+``lret`` Stack Frame Construction
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The transition to 64-bit mode is initiated by executing an ``lret`` instruction from 32-bit code.
+The behavior of this instruction is determined by the characteristics of the destination code segment descriptor referenced by the selector on the stack.
+
+The stack is prepared as follows:
+
+1. ``leal (_entry64 - .Lbase)(%esi), %eax``: The PIE-compatible virtual address of the 64-bit entry point is calculated and placed in ``%eax``.
+
+2. ``pushl $global_descriptor_table_code``: The 16-bit selector for the 64-bit code segment is pushed onto the stack as a 32-bit value.
+
+3. ``pushl %eax``: The 32-bit address of the entry point is pushed onto the stack.
+
+When ``lret`` is executed in 32-bit mode, it pops a 32-bit instruction pointer and a 16-bit code selector from the stack [3]_.
+The processor then examines the GDT descriptor referenced by the new code selector.
+Because this descriptor has its L bit (64-bit code segment) set to 1, and long mode has already been activated by enabling paging with EFER.LME set, the processor leaves 32-bit compatibility mode and enters 64-bit mode.
+It then begins executing at the 64-bit address specified by the popped instruction pointer [2]_.
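
The frame and descriptor described above can be modeled in a short C++ sketch.
It is illustrative only and is not taken from the brief's sources: the names ``FarReturnFrame32``, ``kCode64``, ``selects_64bit_mode`` and ``make_selector`` are hypothetical, and placing the code descriptor at GDT index 1 is an assumption the brief does not state.
The bit positions follow the segment descriptor layout documented in [2]_.

```cpp
#include <cstdint>

// Hypothetical model of the two 32-bit stack slots consumed by a 32-bit lret.
struct FarReturnFrame32 {
    std::uint32_t offset;    // pushed last (pushl %eax), popped first into the instruction pointer
    std::uint32_t selector;  // pushed first; only the low 16 bits are loaded into %cs
};
static_assert(sizeof(FarReturnFrame32) == 8,
              "a 32-bit lret pops two 4-byte slots");

// Bit positions within a 64-bit segment descriptor.
constexpr std::uint64_t kReadable   = 1ULL << 41;  // code segment is readable
constexpr std::uint64_t kExecutable = 1ULL << 43;  // code (not data) segment
constexpr std::uint64_t kCodeOrData = 1ULL << 44;  // S: not a system descriptor
constexpr std::uint64_t kPresent    = 1ULL << 47;  // P
constexpr std::uint64_t kLongMode   = 1ULL << 53;  // L: 64-bit code segment
constexpr std::uint64_t kDefault32  = 1ULL << 54;  // D: must be 0 when L is 1

// A minimal 64-bit code descriptor; base and limit are ignored in 64-bit mode.
constexpr std::uint64_t kCode64 =
    kPresent | kCodeOrData | kExecutable | kReadable | kLongMode;

// True if a far transfer to this descriptor selects 64-bit mode
// (assuming long mode is already active).
constexpr bool selects_64bit_mode(std::uint64_t descriptor) {
    return (descriptor & kExecutable) != 0 &&
           (descriptor & kLongMode) != 0 &&
           (descriptor & kDefault32) == 0;
}

// Builds a GDT selector: index in bits 15..3, table indicator 0 (GDT), RPL in bits 1..0.
constexpr std::uint16_t make_selector(unsigned index, unsigned rpl = 0) {
    return static_cast<std::uint16_t>((index << 3) | (rpl & 3u));
}

static_assert(selects_64bit_mode(kCode64), "L=1, D=0 selects 64-bit mode");
```

Under these assumptions, ``make_selector(1)`` yields the value that would be pushed in step 2, and clearing ``kLongMode`` in the descriptor would leave the processor in compatibility mode after the ``lret``.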
+
+Memory Virtualization and GDT
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A four-level page table hierarchy (PML4) is constructed to enable paging, a prerequisite for long mode.
+An initial identity map of 32 MiB of physical memory is created using 2 MiB huge pages.
+This reduces the number of required page table entries for the initial kernel image.
+A recursive mapping in the PML4 at a conventional index (511) is also established.
+This powerful technique allows the C++ kernel's memory manager to access and modify the entire page table hierarchy as if it were a linear array at a single, well-known virtual address.
+This greatly simplifies the logic required for virtual memory operations.
+
+A new GDT is defined containing the necessary null, 64-bit code, and 64-bit data descriptors.
+The first entry in the GDT must be a null descriptor, as the processor architecture reserves selector value 0 as a special "null selector."
+Loading a data-segment register (DS, ES, FS, or GS) with this null selector is valid.
+However, outside 64-bit mode any subsequent memory access through such a register generates a general-protection exception, and loading CS or SS with a null selector raises the exception immediately.
+This provides a fail-safe mechanism against the use of uninitialized segment selectors [2]_.
+The selector for the data descriptor is exported as a global symbol (``global_descriptor_table_data``).
+This design choice prioritizes explicitness and debuggability: the dependency is clearly visible in the source code.
+The alternative, passing the selector value in a register, would create an implicit, less obvious contract between the two stages that could complicate future maintenance.
+
+References
+----------
+
+.. [1] Free Software Foundation, "The Multiboot2 Specification, version 2.0," Free Software Foundation, Inc., 2016. `Online <https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html>`_.
+
+.. [2] Intel Corporation, *Intel® 64 and IA-32 Architectures Software Developer’s Manual, Combined Volumes 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4*, Order No. 325462-081US, July 2025. `Online <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html>`_.
+
+.. [3] AMD, Inc., *AMD64 Architecture Programmer’s Manual, Volume 3: General-Purpose and System Instructions*, Publication No. 24594, Rev. 3.42, June 2025. `Online <https://www.amd.com/en/support/tech-docs/amd64-architecture-programmers-manual-volumes-1-5>`_.
diff --git a/docs/requirements.txt b/docs/requirements.txt
index 3c3a2e5..733e873 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,3 +1,2 @@
 Sphinx~=8.2.0
-breathe~=4.36.0
 sphinx_book_theme~=1.1.0
