8. IPC Endpoints
Processes can sleep, yield, and print — but they cannot talk to each other. This chapter adds the first explicitly microkernel-shaped mechanism: endpoints for synchronous IPC.
An endpoint is a rendezvous point. One process sends, another receives. If no partner is waiting, the caller blocks. When both sides meet, the kernel copies message registers from sender to receiver and both become runnable.
The send is synchronous from the sender’s perspective — control does not return until a receiver picks up the message — but the kernel does keep waiting senders (and their messages) on a queue inside the endpoint. There is no user-visible asynchronous send: a sender that has not yet been matched is BlockedOnEndpoint and not runnable.
```
created endpoint 0
spawned pid 0
spawned pid 1
hart 1 online
IP
A1
B1
A2
pid 0: exited
B2
pid 1: exited
all processes exited
```

Program B sends the bytes `I` and `P` through endpoint 0. Program A receives them and prints. The rest is sleep-based interleaving from Chapter 7. On multi-hart QEMU, secondary hart startup messages can interleave with process output, so `IP` may appear split by `hart ... online` lines.
What Changes
| File | Status |
|---|---|
| `src/endpoint.rs` | new: endpoint table, send/receive rendezvous |
| `src/sched/process.rs` | add `BlockedOnEndpoint`, arg initialization in `spawn_with_args`, endpoint helpers |
| `src/sched/syscall.rs` | add `SYS_EP_CREATE`, `SYS_EP_SEND`, `SYS_EP_RECV` |
| `src/sched/mod.rs` | halt with a distinct message if all live processes are endpoint-blocked |
| `src/trap/mod.rs` | add `TrapFrame` accessors: `a1`, `a2`, `set_a1`–`set_a3`, `set_args` |
| `src/demo.rs` | update: endpoint IPC demo, `spawn_with_args` |
| `src/main.rs` | add `mod endpoint` |
Extend the TrapFrame
The endpoint syscalls read two message words (`a1`, `a2`) and write up to four return values. Add accessors to `src/trap/mod.rs` in the `impl TrapFrame` block:

```rust
// x[i] holds register x(i+1), so a1 (x11) lives at x[10], a2 at x[11], a3 at x[12].
pub fn a1(&self) -> usize {
    self.x[10]
}

pub fn a2(&self) -> usize {
    self.x[11]
}

pub fn set_a1(&mut self, val: usize) {
    self.x[10] = val;
}

pub fn set_a2(&mut self, val: usize) {
    self.x[11] = val;
}

pub fn set_a3(&mut self, val: usize) {
    self.x[12] = val;
}

/// Write a0..a3 in one call; used for initial arguments and message delivery.
pub fn set_args(&mut self, args: [usize; 4]) {
    self.set_a0(args[0]);
    self.set_a1(args[1]);
    self.set_a2(args[2]);
    self.set_a3(args[3]);
}
```

`TrapFrame::new` keeps its existing signature. It is shown here for reference only:

```rust
impl TrapFrame {
    pub fn new(entry: usize, sp: usize) -> Self {
        let mut tf = TrapFrame {
            sepc: entry,
            sstatus: 1 << 5,
            scause: 0,
            stval: 0,
            x: [0; 31],
            kernel_satp: 0,
            kernel_sp: 0,
            kernel_trap_ra: 0,
        };
        tf.x[1] = sp;
        tf
    }
    // ... accessors from above ...
}
```

No change is needed to `new` itself: callers apply `set_args` after construction to install initial arguments.
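The index arithmetic behind these accessors can be checked on the host. The following is a minimal sketch, assuming (as `tf.x[1] = sp` in `TrapFrame::new` suggests) that slot `x[i]` stores register `x(i+1)` because the hard-wired zero register is not saved. `MiniFrame` and `slot_for` are hypothetical names used only for illustration; they are not part of the kernel.

```rust
/// Hypothetical stand-in for the kernel's TrapFrame: 31 saved registers,
/// x1..x31 (x0 is hard-wired to zero and is assumed not to be stored).
pub struct MiniFrame {
    pub x: [usize; 31],
}

/// Slot index for RISC-V register number `reg` (valid for 1..=31).
pub const fn slot_for(reg: usize) -> usize {
    reg - 1
}

// In the RISC-V calling convention, a0..a3 are registers x10..x13.
pub const A0: usize = slot_for(10);
pub const A1: usize = slot_for(11);
pub const A2: usize = slot_for(12);
pub const A3: usize = slot_for(13);

impl MiniFrame {
    pub fn new() -> Self {
        MiniFrame { x: [0; 31] }
    }

    /// Same shape as the chapter's `set_args`: write a0..a3 in one call.
    pub fn set_args(&mut self, args: [usize; 4]) {
        self.x[A0] = args[0];
        self.x[A1] = args[1];
        self.x[A2] = args[2];
        self.x[A3] = args[3];
    }
}
```

Under that assumption, `A1`, `A2`, and `A3` evaluate to 10, 11, and 12, which matches the literal indices used in the accessors.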
The Endpoint Table
Create `src/endpoint.rs`:

```rust
use alloc::collections::VecDeque;
use alloc::vec::Vec;
use spin::Mutex;

#[derive(Clone, Copy)]
struct Sender {
    pid: usize,
    msg: [usize; 4],
}

struct Endpoint {
    senders: VecDeque<Sender>,
    receivers: VecDeque<usize>,
}

struct EndpointTable {
    endpoints: Vec<Endpoint>,
}

pub(crate) enum EndpointResult {
    Matched { peer_pid: usize, msg: [usize; 4] },
    Blocked,
    InvalidEndpoint,
}

static TABLE: Mutex<EndpointTable> = Mutex::new(EndpointTable::new());

impl EndpointTable {
    const fn new() -> Self {
        Self {
            endpoints: Vec::new(),
        }
    }

    fn create(&mut self) -> u32 {
        let id = self.endpoints.len() as u32;
        self.endpoints.push(Endpoint {
            senders: VecDeque::new(),
            receivers: VecDeque::new(),
        });
        id
    }

    fn send(
        &mut self,
        ep_id: u32,
        sender_pid: usize,
        msg: [usize; 4],
    ) -> EndpointResult {
        let Some(ep) = self.endpoints.get_mut(ep_id as usize) else {
            return EndpointResult::InvalidEndpoint;
        };

        if let Some(peer_pid) = ep.receivers.pop_front() {
            EndpointResult::Matched { peer_pid, msg }
        } else {
            ep.senders.push_back(Sender {
                pid: sender_pid,
                msg,
            });
            EndpointResult::Blocked
        }
    }

    fn receive(
        &mut self,
        ep_id: u32,
        receiver_pid: usize,
    ) -> EndpointResult {
        let Some(ep) = self.endpoints.get_mut(ep_id as usize) else {
            return EndpointResult::InvalidEndpoint;
        };

        if let Some(sender) = ep.senders.pop_front() {
            EndpointResult::Matched {
                peer_pid: sender.pid,
                msg: sender.msg,
            }
        } else {
            ep.receivers.push_back(receiver_pid);
            EndpointResult::Blocked
        }
    }
}

pub(crate) fn create() -> u32 {
    TABLE.lock().create()
}

pub(crate) fn send(
    ep_id: u32,
    sender_pid: usize,
    msg: [usize; 4],
) -> EndpointResult {
    TABLE.lock().send(ep_id, sender_pid, msg)
}

pub(crate) fn receive(ep_id: u32, receiver_pid: usize) -> EndpointResult {
    TABLE.lock().receive(ep_id, receiver_pid)
}
```

An endpoint has two FIFO queues: waiting senders and waiting receivers. In normal operation, at most one queue is non-empty, because every send first tries to consume a receiver and every receive first tries to consume a sender.
For a valid endpoint ID, each call has one of these outcomes:

- `send` finds a waiting receiver → returns `Matched` with the receiver's PID and the message.
- `receive` finds a waiting sender → returns `Matched` with the sender's PID and message.
- No partner → the caller is enqueued and `Blocked` is returned.
The endpoint table only manages queues. It does not touch process state or trap frames — that responsibility stays with the process table. This separation keeps endpoint matching simple while letting the scheduler control state transitions.
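The matching rule and the one-queue-at-a-time invariant can be exercised on the host with `std` alone. This is a sketch that mirrors the shape of `Endpoint` rather than the kernel code itself; `MiniEndpoint`, `Outcome`, and `invariant_holds` are illustrative names, and the invalid-endpoint case is left out.

```rust
use std::collections::VecDeque;

/// Result of one send or receive attempt on a single endpoint.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Matched { peer_pid: usize, msg: [usize; 4] },
    Blocked,
}

/// Host-side model of one endpoint: two FIFO queues of waiters.
#[derive(Default)]
pub struct MiniEndpoint {
    senders: VecDeque<(usize, [usize; 4])>,
    receivers: VecDeque<usize>,
}

impl MiniEndpoint {
    /// Try to hand `msg` to the oldest waiting receiver; otherwise queue the sender.
    pub fn send(&mut self, pid: usize, msg: [usize; 4]) -> Outcome {
        match self.receivers.pop_front() {
            Some(peer_pid) => Outcome::Matched { peer_pid, msg },
            None => {
                self.senders.push_back((pid, msg));
                Outcome::Blocked
            }
        }
    }

    /// Try to take a message from the oldest waiting sender; otherwise queue the receiver.
    pub fn receive(&mut self, pid: usize) -> Outcome {
        match self.senders.pop_front() {
            Some((peer_pid, msg)) => Outcome::Matched { peer_pid, msg },
            None => {
                self.receivers.push_back(pid);
                Outcome::Blocked
            }
        }
    }

    /// At most one of the two queues may hold waiters at any time.
    pub fn invariant_holds(&self) -> bool {
        self.senders.is_empty() || self.receivers.is_empty()
    }
}
```

In this model, a receive on an empty endpoint returns `Blocked`, and the next send returns `Matched` with that receiver's PID. Two queued senders are served in arrival order by later receives.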
Add BlockedOnEndpoint
Update `src/sched/process.rs`. The changes are scattered through the file, so here is the complete replacement:

```rust
use crate::endpoint::{self, EndpointResult};
use crate::memory::{PAGE_SIZE, alloc_frame, alloc_zeroed_frame};
use crate::paging::*;
use crate::trap::TrapFrame;
use alloc::vec::Vec;
use spin::Mutex;

const CODE_VA: usize = 0x10000;
const USER_STACK_BASE: usize = 0x0100_0000;
const USER_STACK_PAGES: usize = 4;

enum State {
    Ready,
    Running(usize),
    Sleeping(u64),
    BlockedOnEndpoint(u32),
    Exited,
}

struct Process {
    state: State,
    tf: *mut TrapFrame,
    satp: usize,
    tf_va: usize,
}

unsafe impl Send for Process {}

#[derive(Clone, Copy)]
pub(crate) struct Runnable {
    pub pid: usize,
    pub tf: *mut TrapFrame,
    pub satp: usize,
    pub tf_va: usize,
}

pub(crate) enum Next {
    Runnable(Runnable),
    SleepUntil(u64),
    Done,
    /// All live processes are blocked on endpoints with no senders/receivers
    /// to match them and no sleepers to wake.
    Deadlocked,
    Empty,
}

static TABLE: Mutex<ProcessTable> = Mutex::new(ProcessTable::new());

pub(crate) fn spawn_with_args(code: &[u8], args: [usize; 4]) -> usize {
    assert!(code.len() <= PAGE_SIZE, "user program too large");

    let code_pa = alloc_zeroed_frame().expect("oom");
    let tf_pa = alloc_frame().expect("oom");

    unsafe {
        core::ptr::copy_nonoverlapping(
            code.as_ptr(),
            code_pa as *mut u8,
            code.len(),
        );
    }

    let tf_va: usize = TRAMPOLINE - PAGE_SIZE;

    let pt = PageTable::alloc();
    pt.map_at(CODE_VA, code_pa, PTE_U | PTE_R | PTE_X, 0);
    let stack_top = crate::memory::map_stack(
        pt,
        USER_STACK_BASE,
        0,
        USER_STACK_PAGES,
        PTE_U,
    );
    pt.map_trampoline();
    pt.map_at(tf_va, tf_pa, PTE_R | PTE_W, 0);

    let tf_ptr = tf_pa as *mut TrapFrame;
    let mut tf = TrapFrame::new(CODE_VA, stack_top);
    tf.set_args(args);
    unsafe { tf_ptr.write(tf) };

    let pid = TABLE.lock().add(tf_ptr, pt.satp(), tf_va);
    println!("spawned pid {}", pid);
    pid
}

pub(crate) fn pick_next(now: u64, hartid: usize) -> Next {
    let mut table = TABLE.lock();
    table.wake_sleepers(now);

    if let Some(runnable) = table.pick_next(hartid) {
        return Next::Runnable(runnable);
    }
    if let Some(deadline) = table.next_sleep_deadline() {
        return Next::SleepUntil(deadline);
    }
    if table.all_exited() {
        return Next::Done;
    }
    if table.all_blocked_on_endpoints() {
        if hartid == crate::boot::boot_hartid() {
            table.report_blocked();
        }
        return Next::Deadlocked;
    }
    Next::Empty
}

pub(crate) fn ready(pid: usize) {
    TABLE.lock().ready(pid);
}

pub(crate) fn sleep_until(pid: usize, deadline: u64) {
    TABLE.lock().sleep_until(pid, deadline);
}

pub(crate) fn exit(pid: usize) {
    TABLE.lock().exit(pid);
}

pub(crate) fn send_endpoint(
    pid: usize,
    ep_id: u32,
    msg: [usize; 4],
    current_tf: &mut TrapFrame,
) {
    let mut table = TABLE.lock();
    match endpoint::send(ep_id, pid, msg) {
        EndpointResult::Matched { peer_pid, msg } => {
            current_tf.set_a0(0);
            table.write_message(peer_pid, msg);
            table.ready(peer_pid);
            table.ready(pid);
        }
        EndpointResult::Blocked => {
            table.block_on_endpoint(pid, ep_id);
        }
        EndpointResult::InvalidEndpoint => {
            println!("pid {}: invalid endpoint {}", pid, ep_id);
            table.exit(pid);
        }
    }
}

pub(crate) fn receive_endpoint(
    pid: usize,
    ep_id: u32,
    current_tf: &mut TrapFrame,
) {
    let mut table = TABLE.lock();
    match endpoint::receive(ep_id, pid) {
        EndpointResult::Matched { peer_pid, msg } => {
            current_tf.set_args(msg);
            table.write_send_status(peer_pid, 0);
            table.ready(peer_pid);
            table.ready(pid);
        }
        EndpointResult::Blocked => {
            table.block_on_endpoint(pid, ep_id);
        }
        EndpointResult::InvalidEndpoint => {
            println!("pid {}: invalid endpoint {}", pid, ep_id);
            table.exit(pid);
        }
    }
}

impl Process {
    fn new(tf: *mut TrapFrame, satp: usize, tf_va: usize) -> Self {
        Self {
            state: State::Ready,
            tf,
            satp,
            tf_va,
        }
    }

    fn to_runnable(&self, pid: usize) -> Runnable {
        Runnable {
            pid,
            tf: self.tf,
            satp: self.satp,
            tf_va: self.tf_va,
        }
    }

    fn sleeping_deadline(&self) -> Option<u64> {
        match self.state {
            State::Sleeping(deadline) => Some(deadline),
            _ => None,
        }
    }

    fn wake_if_due(&mut self, now: u64) {
        if self
            .sleeping_deadline()
            .is_some_and(|deadline| now >= deadline)
        {
            self.state = State::Ready;
        }
    }
}

struct ProcessTable {
    processes: Vec<Process>,
    cursor: usize,
}

impl ProcessTable {
    const fn new() -> Self {
        Self {
            processes: Vec::new(),
            cursor: 0,
        }
    }

    fn add(&mut self, tf: *mut TrapFrame, satp: usize, tf_va: usize) -> usize {
        let pid = self.processes.len();
        self.processes.push(Process::new(tf, satp, tf_va));
        pid
    }

    fn wake_sleepers(&mut self, now: u64) {
        for proc in &mut self.processes {
            proc.wake_if_due(now);
        }
    }

    fn pick_next(&mut self, hartid: usize) -> Option<Runnable> {
        let len = self.processes.len();
        (0..len).find_map(|_| {
            let idx = self.cursor % len;
            self.cursor = (idx + 1) % len;
            let proc = &mut self.processes[idx];
            if matches!(proc.state, State::Ready) {
                proc.state = State::Running(hartid);
                Some(proc.to_runnable(idx))
            } else {
                None
            }
        })
    }

    fn next_sleep_deadline(&self) -> Option<u64> {
        self.processes
            .iter()
            .filter_map(Process::sleeping_deadline)
            .min()
    }

    fn all_exited(&self) -> bool {
        !self.processes.is_empty()
            && self
                .processes
                .iter()
                .all(|proc| matches!(proc.state, State::Exited))
    }

    /// At least one process exists, and every live (non-exited) process is
    /// parked on an endpoint. With no sleepers and nothing runnable, no
    /// further progress is possible.
    fn all_blocked_on_endpoints(&self) -> bool {
        let mut any_live = false;
        for proc in &self.processes {
            match proc.state {
                State::Exited => {}
                State::BlockedOnEndpoint(_) => any_live = true,
                _ => return false,
            }
        }
        any_live
    }

    fn report_blocked(&self) {
        println!("deadlock: all live processes blocked on endpoints");
        for (pid, proc) in self.processes.iter().enumerate() {
            if let State::BlockedOnEndpoint(ep) = proc.state {
                println!("  pid {} blocked on endpoint {}", pid, ep);
            }
        }
    }

    fn ready(&mut self, pid: usize) {
        self.processes[pid].state = State::Ready;
    }

    fn sleep_until(&mut self, pid: usize, deadline: u64) {
        self.processes[pid].state = State::Sleeping(deadline);
    }

    fn exit(&mut self, pid: usize) {
        self.processes[pid].state = State::Exited;
    }

    fn block_on_endpoint(&mut self, pid: usize, endpoint_id: u32) {
        self.processes[pid].state = State::BlockedOnEndpoint(endpoint_id);
    }

    fn write_message(&mut self, pid: usize, msg: [usize; 4]) {
        let tf = unsafe { &mut *self.processes[pid].tf };
        tf.set_args(msg);
    }

    fn write_send_status(&mut self, pid: usize, status: usize) {
        let tf = unsafe { &mut *self.processes[pid].tf };
        tf.set_a0(status);
    }
}
```

Key additions:
- `BlockedOnEndpoint(u32)`: the process is waiting for a partner on a specific endpoint. The endpoint id is carried so the deadlock report can show which endpoint each blocked process is waiting on.
- `Deadlocked`: a new `Next` variant. When every live process is `BlockedOnEndpoint` and nothing is runnable or sleeping, no message can ever arrive; reporting and halting beats spinning on `wfi` forever. Handle this separately from `Done` in `src/sched/mod.rs` so the boot hart prints an endpoint-deadlock halt message instead of `all processes exited`.
- `spawn_with_args` now uses its args: it sets the initial `a0`–`a3` before the first `sret`. The demo passes the endpoint ID this way. It also switches the code page to `alloc_zeroed_frame` so any unused bytes decode to a deterministic illegal instruction instead of stale frame contents.
- `send_endpoint` / `receive_endpoint`: hold the process table lock while calling `endpoint::send` / `endpoint::receive`. This ensures that enqueuing a waiter and updating process state happen atomically. If the endpoint call finds a partner, both processes' trap frames are updated and both become `Ready`. Invalid endpoint ids are reported and the offending process exits, so no separate result type is needed.
The lock ordering is always process table first, endpoint table second. The endpoint module never takes the process table lock, so there is no deadlock risk.
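The ordering rule can be illustrated with two `std::sync::Mutex` values standing in for the kernel's `spin::Mutex` tables. This is only a sketch: `PROCS`, `ENDPOINTS`, `enqueue_waiter`, `block_sender`, and `state_of` are hypothetical names, and process states are reduced to strings. The outer function takes the process lock, and the inner helper takes only the endpoint lock, so the two locks are always acquired in the same order.

```rust
use std::sync::Mutex;

// Stand-ins for the two kernel tables (assumed shapes, for illustration only).
pub static PROCS: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
pub static ENDPOINTS: Mutex<Vec<usize>> = Mutex::new(Vec::new());

/// Inner layer: touches only the endpoint lock, never PROCS.
pub fn enqueue_waiter(pid: usize) -> usize {
    let mut waiters = ENDPOINTS.lock().unwrap();
    waiters.push(pid);
    waiters.len()
}

/// Outer layer: process lock first, endpoint lock second (inside the call).
/// The enqueue and the state change share one process-lock critical section.
pub fn block_sender(pid: usize) -> usize {
    let mut procs = PROCS.lock().unwrap();
    let queued = enqueue_waiter(pid);
    if procs.len() <= pid {
        procs.resize(pid + 1, "ready");
    }
    procs[pid] = "blocked";
    queued
}

/// Read back a process state; takes only the process lock.
pub fn state_of(pid: usize) -> Option<&'static str> {
    PROCS.lock().unwrap().get(pid).copied()
}
```

Because no code path in this sketch acquires `ENDPOINTS` and then `PROCS`, two threads calling `block_sender` concurrently cannot each hold the lock the other one needs.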
Update the Scheduler Halt Path
`pick_next` can now return `Next::Deadlocked`, so update the scheduler match in `src/sched/mod.rs` and give `halt` an explicit message:

```rust
        Next::SleepUntil(deadline) => sleep_until_timer(deadline),
        Next::Done => halt(hartid == crate::boot::boot_hartid(), "all processes exited"),
        Next::Deadlocked => halt(
            hartid == crate::boot::boot_hartid(),
            "system halted: endpoint deadlock",
        ),
        Next::Empty => {
```

Then update `halt`:

```rust
fn halt(boot: bool, message: &str) -> ! {
    sbi::set_timer(u64::MAX);
    if boot {
        println!("{}", message);
    }
    loop {
        disable_interrupts();
        unsafe { core::arch::asm!("wfi") };
    }
}
```

The normal exit path still prints `all processes exited`; endpoint deadlock now gets a distinct final line.
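The decision order inside `pick_next` can be modelled as a pure function over process states. This is a sketch under simplifying assumptions: it ignores the round-robin cursor, the `Running(hartid)` payload, and sleeper wake-up, and `St`, `Verdict`, and `classify` are illustrative names rather than kernel APIs.

```rust
/// Simplified process states (assumed model, not the kernel's `State`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum St {
    Ready,
    Running,
    Sleeping(u64),
    Blocked(u32),
    Exited,
}

/// Simplified scheduler decision, one variant per `Next` case.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Run(usize),
    SleepUntil(u64),
    Done,
    Deadlocked,
    Empty,
}

/// Same check order as the chapter: runnable, sleepers, all exited, deadlock.
pub fn classify(states: &[St]) -> Verdict {
    if let Some(pid) = states.iter().position(|s| *s == St::Ready) {
        return Verdict::Run(pid);
    }
    let earliest = states
        .iter()
        .filter_map(|s| match s {
            St::Sleeping(deadline) => Some(*deadline),
            _ => None,
        })
        .min();
    if let Some(deadline) = earliest {
        return Verdict::SleepUntil(deadline);
    }
    if !states.is_empty() && states.iter().all(|s| *s == St::Exited) {
        return Verdict::Done;
    }
    // Deadlock: at least one live process, and every live one is endpoint-blocked.
    let mut any_live = false;
    for s in states {
        match s {
            St::Exited => {}
            St::Blocked(_) => any_live = true,
            _ => return Verdict::Empty,
        }
    }
    if any_live { Verdict::Deadlocked } else { Verdict::Empty }
}
```

A process that is still `Running` on another hart keeps the verdict at `Empty`, since that process might yet perform the matching send or receive. A pending sleeper has the same effect through `SleepUntil`.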
Endpoint Syscalls
Update `src/sched/syscall.rs`:

```rust
use crate::trap::TrapFrame;
use super::{process, timer};

const SYS_EXIT: usize = 1;
const SYS_SLEEP: usize = 2;
const SYS_YIELD: usize = 3;
const SYS_EP_CREATE: usize = 10;
const SYS_EP_SEND: usize = 11;
const SYS_EP_RECV: usize = 12;
const SYS_PUTCHAR: usize = 100;

pub(super) fn handle(pid: usize, tf: &mut TrapFrame) {
    match tf.a7() {
        SYS_EXIT => {
            println!("pid {}: exited", pid);
            process::exit(pid);
        }
        SYS_SLEEP => {
            let ms = tf.a0() as u64;
            process::sleep_until(
                pid,
                timer::rdtime() + ms * timer::TICKS_PER_MS,
            );
        }
        SYS_YIELD => {
            process::ready(pid);
        }
        SYS_EP_CREATE => {
            let ep_id = crate::endpoint::create();
            tf.set_a0(ep_id as usize);
            process::ready(pid);
        }
        SYS_EP_SEND => {
            let ep_id = tf.a0() as u32;
            let msg = [tf.a1(), tf.a2(), 0, 0];
            process::send_endpoint(pid, ep_id, msg, tf);
        }
        SYS_EP_RECV => {
            let ep_id = tf.a0() as u32;
            process::receive_endpoint(pid, ep_id, tf);
        }
        SYS_PUTCHAR => {
            print!("{}", tf.a0() as u8 as char);
            process::ready(pid);
        }
        nr => {
            println!("pid {}: unknown syscall {nr}", pid);
            process::exit(pid);
        }
    }
}
```

Three new syscalls:
- `SYS_EP_CREATE` (10): creates an endpoint, returns its ID in `a0`.
- `SYS_EP_SEND` (11): `a0` = endpoint ID, `a1`/`a2` = two message words. Blocks if no receiver is waiting. Returns `a0 = 0` after delivery.
- `SYS_EP_RECV` (12): `a0` = endpoint ID. Blocks if no sender is waiting. Returns the message in `a0`/`a1`; because delivery goes through `set_args`, `a2` and `a3` are overwritten with zero.
Each handler arm sets the next state for pid directly (ready, sleeping, exited, or BlockedOnEndpoint via the endpoint helpers) and returns. Send and receive may park the process, in which case pick_next will skip it until a partner arrives. Invalid endpoint ids are logged inside send_endpoint/receive_endpoint, which then mark the offending process Exited.
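The register convention can be summarised with a small host-side model. It is a sketch, not the kernel's handler: `Regs`, `pack_send`, and `deliver` are hypothetical helpers that only mimic how the send arm builds `msg` from `a1`/`a2` and how `set_args` writes it into the receiver's `a0`–`a3`.

```rust
pub const SYS_EP_SEND: usize = 11;
pub const SYS_EP_RECV: usize = 12;

/// The handful of registers the endpoint syscalls look at (assumed model).
#[derive(Default, Debug, PartialEq, Clone, Copy)]
pub struct Regs {
    pub a0: usize,
    pub a1: usize,
    pub a2: usize,
    pub a3: usize,
    pub a7: usize,
}

/// Sender side: a0 carries the endpoint ID, a1/a2 carry the two payload words.
pub fn pack_send(regs: &Regs) -> (u32, [usize; 4]) {
    (regs.a0 as u32, [regs.a1, regs.a2, 0, 0])
}

/// Rendezvous: the receiver's a0..a3 get the message, the sender's a0 gets status 0.
pub fn deliver(sender: &mut Regs, receiver: &mut Regs, msg: [usize; 4]) {
    receiver.a0 = msg[0];
    receiver.a1 = msg[1];
    receiver.a2 = msg[2];
    receiver.a3 = msg[3];
    sender.a0 = 0;
}
```

Note the one-register shift this models: the payload leaves the sender in `a1`/`a2` and arrives at the receiver in `a0`/`a1`, because the sender's `a0` is occupied by the endpoint ID.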
Update the Demo
Update `src/demo.rs`:

```rust
use crate::sched::process;

pub fn spawn() {
    let ep_id = crate::endpoint::create();
    println!("created endpoint {}", ep_id);
    let args = [ep_id as usize, 0, 0, 0];

    process::spawn_with_args(
        prog_bytes(_prog_a_start, _prog_a_end),
        args,
    );
    process::spawn_with_args(
        prog_bytes(_prog_b_start, _prog_b_end),
        args,
    );
}

fn prog_bytes(
    start: unsafe extern "C" fn() -> u8,
    end: unsafe extern "C" fn() -> u8,
) -> &'static [u8] {
    let s = start as usize;
    let len = end as usize - s;
    unsafe { core::slice::from_raw_parts(s as *const u8, len) }
}

unsafe extern "C" {
    fn _prog_a_start() -> u8;
    fn _prog_a_end() -> u8;
    fn _prog_b_start() -> u8;
    fn _prog_b_end() -> u8;
}

core::arch::global_asm!(
    r#"
    .pushsection .rodata.user_prog, "a"

    .globl _prog_a_start
    _prog_a_start:
        mv   s5, a0             # save endpoint ID

        li   a7, 12             # SYS_EP_RECV
        mv   a0, s5
        ecall
        mv   s3, a0             # received msg[0]
        mv   s4, a1             # received msg[1]
        li   a7, 100            # print received chars
        mv   a0, s3
        ecall
        li   a7, 100
        mv   a0, s4
        ecall
        li   a7, 100
        li   a0, '\n'
        ecall

        li   a7, 100
        li   a0, 'A'
        ecall
        li   a7, 100
        li   a0, '1'
        ecall
        li   a7, 100
        li   a0, '\n'
        ecall
        li   a7, 2
        li   a0, 100
        ecall
        li   a7, 100
        li   a0, 'A'
        ecall
        li   a7, 100
        li   a0, '2'
        ecall
        li   a7, 100
        li   a0, '\n'
        ecall
        li   a7, 1
        ecall
    .globl _prog_a_end
    _prog_a_end:

    .globl _prog_b_start
    _prog_b_start:
        mv   s5, a0             # save endpoint ID

        li   a7, 11             # SYS_EP_SEND
        mv   a0, s5
        li   a1, 'I'
        li   a2, 'P'
        ecall
        li   a7, 2              # sleep 20ms (let receiver print first)
        li   a0, 20
        ecall

        li   a7, 100
        li   a0, 'B'
        ecall
        li   a7, 100
        li   a0, '1'
        ecall
        li   a7, 100
        li   a0, '\n'
        ecall
        li   a7, 2
        li   a0, 200
        ecall
        li   a7, 100
        li   a0, 'B'
        ecall
        li   a7, 100
        li   a0, '2'
        ecall
        li   a7, 100
        li   a0, '\n'
        ecall
        li   a7, 1
        ecall
    .globl _prog_b_end
    _prog_b_end:

    .popsection
"#
);
```

The demo creates one endpoint and passes its ID to both processes as `a0`:
- Program A receives on the endpoint (blocks until B sends), then prints the two received bytes (`I` and `P`). Continues with print/sleep/exit.
- Program B sends `'I'` and `'P'` on the endpoint, sleeps 20 ms to let A print the received message first, then continues with print/sleep/exit.
spawn_with_args places the endpoint ID in a0 before the first sret. The user programs save it to s5 (a callee-saved register) on entry.
Update main.rs
Add `mod endpoint`:

```rust
#![no_std]
#![no_main]

extern crate alloc;

#[macro_use]
mod utils;
mod boot;
mod demo;
mod endpoint;
mod memory;
mod paging;
mod sched;
mod trap;

core::arch::global_asm!(include_str!("../boot.S"));

#[unsafe(no_mangle)]
pub extern "C" fn kernel_main(hartid: usize, dtb_ptr: usize) -> ! {
    let stack_top = boot::init(hartid, dtb_ptr);
    unsafe { _switch_to_stack(stack_top, kernel_main_on_stack, hartid) }
}

unsafe extern "C" {
    fn _switch_to_stack(
        stack_top: usize,
        entry: extern "C" fn(usize) -> !,
        arg0: usize,
    ) -> !;
}

extern "C" fn kernel_main_on_stack(hartid: usize) -> ! {
    trap::init_hart();

    demo::spawn();
    boot::start_secondary_harts();
    sched::run(hartid)
}
```

Run It

```
cargo run --release
```

```
created endpoint 0
spawned pid 0
spawned pid 1
hart 1 online
IP
A1
B1
A2
pid 0: exited
B2
pid 1: exited
all processes exited
```

On multi-hart QEMU, this output is intentionally not byte-for-byte stable: secondary hart startup messages can interleave with process `putchar` syscalls. The durable check is that both `I` and `P` are delivered before both processes exit.
The IPC sequence:
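- An endpoint is created (ID 0) and passed to both processes.
- Program A calls `SYS_EP_RECV`. No sender is waiting, so A blocks.
- Program B calls `SYS_EP_SEND` with message `['I', 'P']`. A is waiting, so the rendezvous succeeds: B gets `a0 = 0`, A gets `a0 = 'I'` and `a1 = 'P'`. Both become `Ready`.
- A prints "IP\n". B sleeps 20 ms, then prints "B1\n".
- The rest proceeds as in Chapter 7.

If B's send reaches the kernel before A's receive, which can happen when the two run on different harts, the roles flip: B blocks, A's receive completes the match, and the register results are the same.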
If the IP line does not appear, check that spawn_with_args calls tf.set_args(args) and that the user program saves a0 to a callee-saved register before the first ecall.
Checkpoint
```
OpenSBI
  -> boot::init (memory, paging, heap, stack)
  -> kernel_main_on_stack
  -> trap::init_hart
  -> endpoint::create (endpoint 0)
  -> spawn pid 0 (A, receives on ep 0)
  -> spawn pid 1 (B, sends on ep 0)
  -> start_secondary_harts
  -> sched::run
    -> A calls EP_RECV -> blocks
    -> B calls EP_SEND -> matches A
      -> A gets message in a0/a1
      -> B gets status in a0
      -> both Ready
    -> normal print/sleep/exit cycle
    -> all exited -> halt
```

The kernel now supports synchronous IPC between processes. Endpoints are the foundation of microkernel design: instead of building every service into the kernel, user-space servers can provide services through endpoints.
What Comes Next
Any process can currently send to any endpoint by its integer ID. Chapter 9 explores capabilities: unforgeable tokens that gate access to kernel objects. Instead of passing a raw endpoint ID, a process would hold a capability that grants specific permissions on a specific endpoint.