3. Physical Memory
Chapter 2 proved the kernel runs under OpenSBI. The device tree pointer arrived in a1 but was never read. The kernel has no idea what hardware surrounds it.
This chapter fixes that. By the end, the kernel will:
- parse the device tree to find RAM and count CPUs
- manage physical memory with a buddy frame allocator
- reserve the kernel image, device tree, firmware, and DTB-advertised reserved memory so they are not overwritten
jikei: booted under OpenSBIKernel: 0x80200000 - 0x8020c000 (48 KB) .text: 8 KB, .rodata: 4 KB, .data: 4 KB, .bss: 28 KBCPUs: 1RAM: 0x80000000 - 0x90000000 (256 MB)Reserved: 0x80200000 - 0x8038c000Reserved: 0x8fe00000 - 0x8fe02000Reserved: 0x80000000 - 0x80200000Memory: 64000 free frames (250 MB), 65536 totalThe exact sizes depend on your toolchain. The shape matters: kernel info, CPU count, RAM discovery, reserved ranges, and free frame count.
What Changes
Section titled “What Changes”This is the largest chapter so far. The project gets its first real structure.
| File | Status |
|---|---|
Cargo.toml | add fdt and spin dependencies |
link.ld | add section symbols and _kernel_end |
src/main.rs | restructure around boot::init |
src/boot.rs | new — DTB parsing and memory init |
src/utils/mod.rs | new — groups infrastructure modules |
src/utils/sbi.rs | moved from src/sbi.rs and upgraded |
src/utils/console.rs | new — println! macro |
src/utils/panic.rs | moved from src/panic.rs, uses println! |
src/utils/symbols.rs | new — linker symbol accessors |
src/memory/mod.rs | new — allocator interface |
src/memory/frame_allocator.rs | new — buddy allocator |
boot.S | no change |
.cargo/config.toml | no change |
Delete src/sbi.rs and src/panic.rs — they move into src/utils/.
Add Dependencies
Section titled “Add Dependencies”Update Cargo.toml:
[package]name = "rv-kernel"version = "0.1.0"edition = "2024"
[dependencies]fdt = { version = "0.1.5", features = ["pretty-printing"] }spin = "0.9"
[[bin]]name = "rv-kernel"path = "src/main.rs"test = falsebench = falsefdt parses the flattened device tree that OpenSBI passed in a1. spin provides a mutex that works without an OS — it spins instead of sleeping. We need it to protect the frame allocator from concurrent access once multiple harts are running.
Organize the Project
Section titled “Organize the Project”The project is growing beyond a few flat files. Group infrastructure into modules:
src/├── main.rs├── boot.rs├── utils/│ ├── mod.rs│ ├── sbi.rs│ ├── console.rs│ ├── panic.rs│ └── symbols.rs└── memory/ ├── mod.rs └── frame_allocator.rsutils/ holds console output, SBI calls, linker symbols, and the panic handler. memory/ holds the frame allocator. The sections below create each file.
Upgrade SBI Calls
Section titled “Upgrade SBI Calls”Chapter 2 used inline ecall with a hardcoded extension ID. Now that later chapters will add timers and hart startup, wrap the calling convention once.
Create src/utils/sbi.rs:
#[repr(usize)]enum SbiExt { ConsolePutchar = 0x01,}
fn sbi_call(ext_id: SbiExt, fun_id: usize, args: [usize; 3]) -> (isize, usize) { let mut error: isize; let mut value: usize; unsafe { core::arch::asm!( "ecall", in("a7") ext_id as usize, in("a6") fun_id, inlateout("a0") args[0] => error, inlateout("a1") args[1] => value, in("a2") args[2], ) } (error, value)}
pub fn console_write(buf: &[u8]) { for &byte in buf { sbi_call(SbiExt::ConsolePutchar, 0, [byte as usize, 0, 0]); }}The SBI calling convention uses a7 for the extension ID, a6 for the function ID, and a0–a2 for arguments. a0 returns an error code and a1 returns a value.
For now the only extension is legacy console putchar. Later chapters add timer and hart management extensions to the SbiExt enum.
The puts and put_hex helpers from Chapter 2 are gone. println! replaces them.
Add Formatted Output
Section titled “Add Formatted Output”Create src/utils/console.rs:
use super::sbi;use core::fmt::{self, Write};use spin::Mutex;
static CONSOLE: Mutex<Console> = Mutex::new(Console);
pub struct Console;
impl Console { pub fn locked_write_fmt(args: fmt::Arguments) { let _ = Write::write_fmt(&mut *CONSOLE.lock(), args); }}
impl Write for Console { fn write_str(&mut self, s: &str) -> fmt::Result { sbi::console_write(s.as_bytes()); Ok(()) }}
#[macro_export]macro_rules! println { ($($arg:tt)*) => { $crate::print!("{}\n", format_args!($($arg)*)) };}
#[macro_export]macro_rules! print { ($($arg:tt)*) => { $crate::utils::console::Console::locked_write_fmt(format_args!($($arg)*)) };}Console implements core::fmt::Write, which gives us {}, {:#x}, and every other format specifier for free. The Mutex ensures output from multiple harts does not interleave mid-line.
#[macro_export] makes println! and print! available everywhere in the crate.
Update the Panic Handler
Section titled “Update the Panic Handler”Create src/utils/panic.rs:
use core::panic::PanicInfo;
#[panic_handler]pub fn panic(info: &PanicInfo) -> ! { println!("\n=== KERNEL PANIC ==="); println!("{}\n", info.message());
loop { unsafe { core::arch::asm!("wfi") } }}The panic handler now prints the full panic message instead of a fixed string.
Wire Up the Utils Module
Section titled “Wire Up the Utils Module”Create src/utils/mod.rs:
#[macro_use]pub mod console;pub mod panic;pub mod sbi;pub mod symbols;#[macro_use] on console exports the println! and print! macros to the rest of the crate.
Update the Linker Script
Section titled “Update the Linker Script”The frame allocator needs to know where the kernel image ends. The boot code needs section boundaries for size reporting. Add symbols and page-align each section.
Update link.ld:
ENTRY(_start)
SECTIONS { . = 0x80200000; /* OpenSBI loads kernel here */
.text : { _text_start = .; KEEP(*(.text.init)) *(.text .text.*) _text_end = .; }
. = ALIGN(4096); .rodata : { _rodata_start = .; *(.srodata .srodata.*) *(.rodata .rodata.*) _rodata_end = .; }
. = ALIGN(4096); .data : { *(.sdata .sdata.*) *(.data .data.*) _data_end = .; }
.bss : { . = ALIGN(8); _bss_start = .; *(.sbss .sbss.*) *(.bss .bss.*) . = ALIGN(8); _bss_end = .; }
. = ALIGN(4096); _kernel_end = .;}_kernel_end is the first free address after the kernel image, page-aligned. The frame allocator stores its metadata starting here.
Page-aligning .rodata and .data prepares for Chapter 4, where each section needs different page permissions (execute, read-only, read-write).
Read Linker Symbols from Rust
Section titled “Read Linker Symbols from Rust”Create src/utils/symbols.rs:
unsafe extern "C" { static _text_start: u8; static _text_end: u8; static _rodata_start: u8; static _rodata_end: u8; static _data_end: u8; static _bss_end: u8; static _kernel_end: u8;}
macro_rules! sym { ($name:ident, $sym:ident) => { pub fn $name() -> usize { (&raw const $sym) as usize } };}
sym!(text_start, _text_start);sym!(text_end, _text_end);sym!(rodata_start, _rodata_start);sym!(rodata_end, _rodata_end);sym!(data_end, _data_end);sym!(bss_end, _bss_end);sym!(kernel_end, _kernel_end);Each linker symbol is declared as an extern "C" byte. We never read the byte — we take its address. The sym! macro wraps this into a function that returns the address as usize.
&raw const takes the address without creating a reference, which avoids undefined behavior for linker symbols that do not point to real Rust data.
The Buddy Allocator
Section titled “The Buddy Allocator”This is the largest new piece. The allocator manages physical memory as 4 KiB frames and supports allocating contiguous blocks of 2^order frames.
How It Works
Section titled “How It Works”- Memory is divided into frames. Each frame is one page (4 KiB).
- Frames are grouped into blocks of power-of-2 sizes: 1, 2, 4, … up to 1024 frames.
- Each block size has a free list. To allocate, find the smallest free block that fits and split larger blocks in half as needed.
- When freeing, check if the buddy — the adjacent block at the same level — is also free. If so, merge them into a larger block. Repeat upward.
The buddy of block i at order k is i XOR (1 << k). This is the key insight: the XOR flips exactly the bit that distinguishes the two halves of a pair.
Data Structures
Section titled “Data Structures”Create src/memory/frame_allocator.rs. The sections below build up the file from top to bottom.
use core::{mem::size_of, ptr::NonNull};
pub const PAGE_SIZE: usize = 4096;const MAX_ORDER: usize = 11; // orders 0..=10, max block = 1024 frames = 4 MiB
/// Buddy allocator for physical frames.////// Stored inline at a fixed address:/// `[BuddyAllocator] [Region × num_regions] [Frame × num_frames]`#[repr(C)]pub struct BuddyAllocator { num_regions: usize, pub num_frames: usize, pub free_count: usize, free_lists: [Option<NonNull<Frame>>; MAX_ORDER],}
unsafe impl Send for BuddyAllocator {}
#[repr(C)]pub struct Region { pub start: usize, pub size: usize,}
#[repr(C)]struct Frame { order: u8, free: bool, prev: Option<NonNull<Frame>>, next: Option<NonNull<Frame>>,}BuddyAllocator lives at a fixed physical address right after the kernel image. Its Region and Frame arrays follow it inline in memory:
[BuddyAllocator struct] [Region × N] [Frame × M]This layout means no heap is needed to bootstrap the allocator.
Frame tracks per-frame state: its current order, whether it is free, and doubly-linked list pointers.
unsafe impl Send is required because the struct contains NonNull pointers. Since the allocator is always accessed through a Mutex, this is safe.
Constructor and Layout Helpers
Section titled “Constructor and Layout Helpers”impl BuddyAllocator { pub const fn new() -> Self { Self { num_regions: 0, num_frames: 0, free_count: 0, free_lists: [None; MAX_ORDER], } }
pub fn regions_ptr(&self) -> *mut Region { unsafe { (self as *const Self).add(1).cast::<Region>().cast_mut() } }
fn frames_ptr(&self) -> *mut Frame { self.regions_ptr() .wrapping_add(self.num_regions) .cast::<Frame>() }
fn base(&self) -> usize { self.regions().iter().map(|r| r.start).min().unwrap() / PAGE_SIZE }
/// Address past the end of the frame array. pub fn end_ptr(&self) -> usize { let base = self.base(); let max = self .regions() .iter() .map(|r| (r.start + r.size) / PAGE_SIZE) .max() .unwrap(); self.frames_ptr() as usize + (max - base) * size_of::<Frame>() }
pub fn regions(&self) -> &[Region] { unsafe { core::slice::from_raw_parts(self.regions_ptr(), self.num_regions) } }
fn frame(&mut self, i: usize) -> &mut Frame { debug_assert!(i < self.num_frames); unsafe { &mut *self.frames_ptr().add(i) } }
fn frame_nn(&mut self, i: usize) -> NonNull<Frame> { unsafe { NonNull::new_unchecked(self.frames_ptr().add(i)) } }regions_ptr() points just past the allocator struct. frames_ptr() points just past the region array. base() is the lowest frame number across all regions — frame indices are offsets from this base.
end_ptr() returns the first address past all frame metadata. This is critical: the allocator must reserve this range so it does not hand out pages containing its own bookkeeping.
Adding Regions
Section titled “Adding Regions” pub fn add_region(&mut self, start: usize, end: usize) { let start = start.next_multiple_of(PAGE_SIZE); let end = end & !(PAGE_SIZE - 1); if end > start { unsafe { self.regions_ptr().add(self.num_regions).write(Region { start, size: end - start, }); } self.num_regions += 1; } }Each RAM region from the device tree is page-aligned and stored inline. Regions are added before init() is called.
Free List Operations
Section titled “Free List Operations”The free lists are doubly-linked for O(1) insertion and removal:
fn list_push(&mut self, order: usize, node: NonNull<Frame>) { let n = unsafe { &mut *node.as_ptr() }; n.prev = None; n.next = self.free_lists[order]; if let Some(head) = self.free_lists[order] { unsafe { (*head.as_ptr()).prev = Some(node) }; } self.free_lists[order] = Some(node); }
fn list_remove(&mut self, order: usize, node: NonNull<Frame>) { let n = unsafe { &mut *node.as_ptr() }; match n.prev { Some(prev) => unsafe { (*prev.as_ptr()).next = n.next }, None => self.free_lists[order] = n.next, } if let Some(next) = n.next { unsafe { (*next.as_ptr()).prev = n.prev }; } n.prev = None; n.next = None; }Initialization
Section titled “Initialization” /// Zero-init all frames, then free every valid non-reserved frame. pub fn init(&mut self, reserved: &[(usize, usize)]) { let base = self.base(); let max = self .regions() .iter() .map(|r| (r.start + r.size) / PAGE_SIZE) .max() .unwrap(); self.num_frames = max - base;
for i in 0..self.num_frames { unsafe { self.frames_ptr().add(i).write(Frame { order: 0, free: false, prev: None, next: None, }); } }
let regions = unsafe { core::slice::from_raw_parts(self.regions_ptr(), self.num_regions) }; let in_ram = |a: usize| { regions.iter().any(|r| a >= r.start && a < r.start + r.size) }; let is_reserved = |a: usize| { reserved.iter().any(|&(s, e)| a < e && s < a + PAGE_SIZE) };
for i in 0..self.num_frames { let addr = (base + i) * PAGE_SIZE; if in_ram(addr) && !is_reserved(addr) { self.coalesce(i, 0); } } }init walks every frame. If the corresponding physical address is inside a RAM region and not reserved, it frees the frame via coalesce. Using coalesce instead of a simple list push means adjacent frames are automatically merged into larger blocks during setup.
The reserved list contains address ranges that must not be allocated: the kernel image (including allocator metadata) and the device tree.
Allocation
Section titled “Allocation” pub fn alloc(&mut self, order: usize) -> Option<usize> { let avail = (order..MAX_ORDER).find(|&o| self.free_lists[o].is_some())?;
let head = self.free_lists[avail].unwrap(); let idx = unsafe { head.as_ptr().offset_from(self.frames_ptr()) } as usize; self.list_remove(avail, head);
// Split down to requested order, pushing spare halves onto free lists. let mut cur = avail; while cur > order { cur -= 1; let buddy = idx + (1 << cur); self.frame(buddy).order = cur as u8; self.frame(buddy).free = true; let nn = self.frame_nn(buddy); self.list_push(cur, nn); }
self.frame(idx).order = order as u8; self.frame(idx).free = false; self.free_count -= 1 << order;
Some((self.base() + idx) * PAGE_SIZE) }Allocation searches upward from the requested order until it finds a free block. If the block is larger than needed, it splits repeatedly: at each step the upper half becomes a free buddy and the lower half is split again.
The return value is the physical address: (base + index) × PAGE_SIZE.
Deallocation and Coalescing
Section titled “Deallocation and Coalescing” #[allow(dead_code)] pub fn free(&mut self, addr: usize, order: usize) { self.coalesce(addr / PAGE_SIZE - self.base(), order); }
fn coalesce(&mut self, mut idx: usize, mut order: usize) { self.free_count += 1 << order;
while order < MAX_ORDER - 1 { let buddy = idx ^ (1 << order); if buddy >= self.num_frames { break; } if !self.frame(buddy).free || self.frame(buddy).order != order as u8 { break; } let nn = self.frame_nn(buddy); self.list_remove(order, nn); idx &= !(1 << order); order += 1; }
self.frame(idx).order = order as u8; self.frame(idx).free = true; let nn = self.frame_nn(idx); self.list_push(order, nn); }}coalesce is the heart of the buddy system. After marking a block free, it checks whether the buddy at the same order is also free. If so, both are removed from the free list and merged one order higher. This repeats until the buddy is occupied, is a different order, or the maximum order is reached.
idx ^ (1 << order) finds the buddy. idx &= !(1 << order) keeps the lower index as the merged block.
The Memory Module
Section titled “The Memory Module”Create src/memory/mod.rs:
pub mod frame_allocator;
use frame_allocator::BuddyAllocator;use spin::{Mutex, Once};
pub use frame_allocator::PAGE_SIZE;
static MEMORY: Once<Mutex<&'static mut BuddyAllocator>> = Once::new();
/// Initialize the frame allocator at a fixed physical address.pub fn init(addr: usize) { let ptr = addr as *mut BuddyAllocator; unsafe { ptr.write(BuddyAllocator::new()); MEMORY.call_once(|| Mutex::new(&mut *ptr)); }}
/// Lock the allocator for direct access.pub fn lock() -> spin::MutexGuard<'static, &'static mut BuddyAllocator> { MEMORY.get().expect("memory not initialized").lock()}
/// Allocate a single physical frame (4 KiB). Returns the physical address.pub fn alloc_frame() -> Option<usize> { lock().alloc(0)}Once ensures init runs exactly once and publishes the allocator to all subsequent callers. Mutex protects it from concurrent access.
alloc_frame is a convenience wrapper. Most early code needs single pages. Multi-frame allocation is added when we need it.
Parse the Device Tree
Section titled “Parse the Device Tree”Create src/boot.rs:
use crate::memory;use crate::utils::symbols::*;use fdt::Fdt;
const MAX_RESERVATIONS: usize = 64;
pub fn init(hartid: usize, dtb_ptr: usize) { let _ = hartid; println!("jikei: booted under OpenSBI"); print_kernel_info();
let fdt = unsafe { Fdt::from_ptr(dtb_ptr as *const u8) } .unwrap_or_else(|e| panic!("failed to parse DTB: {e}"));
let cpu_count = fdt.cpus().count(); println!("CPUs: {}", cpu_count);
// Place the frame allocator at the first free address after the kernel. memory::init(kernel_end());
{ let mut mem = memory::lock();
// Discover RAM regions from the device tree. let mut ram_min = usize::MAX; fdt.memory() .regions() .filter_map(|r| { r.size .filter(|&s| s > 0) .map(|s| (r.starting_address as usize, s)) }) .for_each(|(start, size)| { println!( "RAM: {:#x} - {:#x} ({} MB)", start, start + size, size >> 20 ); mem.add_region(start, start + size); ram_min = ram_min.min(start); });
// Reserve the kernel image (including allocator metadata), the DTB, // the pre-kernel firmware area, and any reserved ranges advertised // by the device tree. let alloc_end = mem.end_ptr(); let dtb_end = dtb_ptr + fdt.total_size();
let mut reservations: [(usize, usize); MAX_RESERVATIONS] = [(0, 0); MAX_RESERVATIONS]; let mut n = 0; let mut push = |r: (usize, usize)| { if r.1 <= r.0 { return; } if n == reservations.len() { panic!("too many reserved memory ranges; increase MAX_RESERVATIONS"); } reservations[n] = r; n += 1; };
push((text_start(), alloc_end)); push((dtb_ptr, dtb_end)); if ram_min != usize::MAX && ram_min < text_start() { push((ram_min, text_start())); } for r in fdt.memory_reservations() { let start = r.address() as usize; let end = start .checked_add(r.size()) .expect("FDT memory reservation overflows usize"); push((start, end)); } if let Some(rsv_root) = fdt.find_node("/reserved-memory") { for child in rsv_root.children() { if let Some(regs) = child.reg() { for r in regs { let start = r.starting_address as usize; let Some(size) = r.size else { continue; }; let end = start .checked_add(size) .expect("/reserved-memory range overflows usize"); push((start, end)); } } } }
for &(s, e) in &reservations[..n] { println!("Reserved: {:#x} - {:#x}", s, e); }
mem.init(&reservations[..n]);
println!( "Memory: {} free frames ({} MB), {} total", mem.free_count, (mem.free_count * 4096) >> 20, mem.num_frames, ); }}
fn print_kernel_info() { let kb = |a: usize, b: usize| (b - a) / 1024;
println!( "Kernel: {:#x} - {:#x} ({} KB)", text_start(), bss_end(), kb(text_start(), bss_end()), ); println!( " .text: {} KB, .rodata: {} KB, .data: {} KB, .bss: {} KB", kb(text_start(), text_end()), kb(rodata_start(), rodata_end()), kb(rodata_end(), data_end()), kb(data_end(), bss_end()), );}The init sequence:
- Print kernel section sizes from linker symbols.
- Parse the device tree to find CPUs and RAM.
- Place the
BuddyAllocatorstruct at_kernel_end. - Walk the DTB’s memory regions and register each one.
- Build a reservation list: the kernel image (from
_text_startthrough the allocator’s own metadata atend_ptr()), the device tree blob, the pre-kernel firmware area, and anymemreserve//reserved-memoryranges from the DTB. - Initialize all non-reserved RAM frames.
- Print the result.
The reserved kernel range extends past _kernel_end to mem.end_ptr(). If we only reserved up to _kernel_end, the allocator could give away pages containing its own frame metadata.
The pre-kernel reservation protects firmware. On QEMU virt, OpenSBI occupies RAM below the kernel load address (0x8020_0000), and the DTB may also advertise firmware/device ranges through memreserve entries or /reserved-memory nodes. The allocator must not hand those pages to the kernel.
Update the Entry Point
Section titled “Update the Entry Point”Replace src/main.rs:
#![no_std]#![no_main]
#[macro_use]mod utils;mod boot;mod memory;
core::arch::global_asm!(include_str!("../boot.S"));
#[unsafe(no_mangle)]pub extern "C" fn kernel_main(hartid: usize, dtb_ptr: usize) -> ! { boot::init(hartid, dtb_ptr);
loop { unsafe { core::arch::asm!("wfi") } }}#[macro_use] mod utils imports the println! and print! macros into the crate root so every module can use them.
Run It
Section titled “Run It”cargo run --releaseOpenSBI prints its boot banner, then the kernel output follows:
jikei: booted under OpenSBIKernel: 0x80200000 - 0x8020c000 (48 KB) .text: 8 KB, .rodata: 4 KB, .data: 4 KB, .bss: 28 KBCPUs: 1RAM: 0x80000000 - 0x90000000 (256 MB)Reserved: 0x80200000 - 0x8038c000Reserved: 0x8fe00000 - 0x8fe02000Reserved: 0x80000000 - 0x80200000Memory: 64000 free frames (250 MB), 65536 totalYour numbers will differ. The key facts to check:
- The kernel reports its own size from linker symbols.
- The device tree reveals one 256 MB RAM region.
- Most frames are free. The reserved frames cover the kernel image, allocator metadata, the device tree, firmware, and DTB-advertised reserved memory.
If you see kernel panic or no output after OpenSBI’s banner, check that _kernel_end is defined in link.ld and that Cargo.toml lists both dependencies.
Checkpoint
Section titled “Checkpoint”At this checkpoint, the system is:
OpenSBI -> _start -> boot::init -> parse device tree -> discover RAM -> init buddy allocator -> reserve kernel + DTB + firmware/reserved-memory ranges -> parkThe kernel now knows what memory exists and can hand out physical pages. It does not use them for anything yet — no page tables, no processes, no heap. That changes next.
What Comes Next
Section titled “What Comes Next”Physical frames are raw material. The next chapter builds Sv39 page tables from them: an identity map for the kernel, permission bits for W^X, and the foundation for isolated user address spaces.