8. IPC Endpoints

Processes can sleep, yield, and print — but they cannot talk to each other. This chapter adds the first explicitly microkernel-shaped mechanism: endpoints for synchronous IPC.

An endpoint is a rendezvous point. One process sends, another receives. If no partner is waiting, the caller blocks. When both sides meet, the kernel copies message registers from sender to receiver and both become runnable.

The send is synchronous from the sender’s perspective — control does not return until a receiver picks up the message — but the kernel does keep waiting senders (and their messages) on a queue inside the endpoint. There is no user-visible asynchronous send: a sender that has not yet been matched is BlockedOnEndpoint and not runnable.

created endpoint 0
spawned pid 0
spawned pid 1
hart 1 online
IP
A1
B1
A2
pid 0: exited
B2
pid 1: exited
all processes exited

Program B sends the bytes I and P through endpoint 0. Program A receives them and prints. The rest is sleep-based interleaving from Chapter 7. On multi-hart QEMU, secondary hart startup messages can interleave with process output, so IP may appear split by hart ... online lines.

File                   Status
src/endpoint.rs        new: endpoint table, send/receive rendezvous
src/sched/process.rs   add BlockedOnEndpoint, arg initialization in spawn_with_args, endpoint helpers
src/sched/syscall.rs   add SYS_EP_CREATE, SYS_EP_SEND, SYS_EP_RECV
src/sched/mod.rs       halt with a distinct message if all live processes are endpoint-blocked
src/trap/mod.rs        add TrapFrame accessors: a1, a2, set_a1 through set_a3, set_args
src/demo.rs            update: endpoint IPC demo, spawn_with_args
src/main.rs            add mod endpoint

The endpoint syscalls read two message words (a1, a2) and write up to four return values. Add accessors to src/trap/mod.rs in the impl TrapFrame block:

// x[i] holds register x(i+1), so a1 (x11) lives in x[10], and so on.
pub fn a1(&self) -> usize {
    self.x[10]
}
pub fn a2(&self) -> usize {
    self.x[11]
}
pub fn set_a1(&mut self, val: usize) {
    self.x[10] = val;
}
pub fn set_a2(&mut self, val: usize) {
    self.x[11] = val;
}
pub fn set_a3(&mut self, val: usize) {
    self.x[12] = val;
}
pub fn set_args(&mut self, args: [usize; 4]) {
    self.set_a0(args[0]);
    self.set_a1(args[1]);
    self.set_a2(args[2]);
    self.set_a3(args[3]);
}

TrapFrame::new keeps its two-argument signature. It is repeated here for reference:

impl TrapFrame {
    pub fn new(entry: usize, sp: usize) -> Self {
        let mut tf = TrapFrame {
            sepc: entry,
            sstatus: 1 << 5, // SPIE set, SPP clear: sret enters U-mode with interrupts enabled
            scause: 0,
            stval: 0,
            x: [0; 31],
            kernel_satp: 0,
            kernel_sp: 0,
            kernel_trap_ra: 0,
        };
        tf.x[1] = sp; // x2 (sp)
        tf
    }
    // ... a1, a2, set_a1, set_a2, set_a3, set_args ...
}

Initial arguments are not passed to new. The caller applies them after construction with set_args.
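The accessor indices can look off by one at first glance. They follow from the frame layout: x0 is hardwired to zero and is not saved, so slot i of the array holds register x(i+1). The sketch below is a host-side model of that mapping, not the kernel's TrapFrame; MiniFrame and slot are hypothetical names introduced only for illustration.

```rust
/// Hypothetical host-side stand-in for the kernel's TrapFrame.
/// Only the general-purpose register array is modelled.
#[derive(Debug, Default, Clone, Copy)]
pub struct MiniFrame {
    /// x[i] holds register x(i+1); x0 is not saved.
    pub x: [usize; 31],
}

/// Array slot for RISC-V register number `n` (valid for 1..=31).
pub const fn slot(n: usize) -> usize {
    n - 1
}

impl MiniFrame {
    /// Mirrors the construct-then-fill pattern: only sp is set here.
    pub fn new(sp: usize) -> Self {
        let mut tf = MiniFrame::default();
        tf.x[slot(2)] = sp; // sp is x2
        tf
    }

    /// a0..a3 are x10..x13, so they land in slots 9..=12.
    pub fn set_args(&mut self, args: [usize; 4]) {
        for (i, val) in args.iter().enumerate() {
            self.x[slot(10 + i)] = *val;
        }
    }

    pub fn a0(&self) -> usize {
        self.x[slot(10)]
    }

    pub fn a1(&self) -> usize {
        self.x[slot(11)]
    }
}
```

Constructing a frame and then calling set_args([ep_id, 0, 0, 0]) leaves the endpoint ID in slot 9, which is where the user program reads a0 after the first sret.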

Create src/endpoint.rs:

use alloc::collections::VecDeque;
use alloc::vec::Vec;
use spin::Mutex;
/// A blocked sender together with the message it is waiting to deliver.
#[derive(Clone, Copy)]
struct Sender {
    pid: usize,
    msg: [usize; 4],
}
struct Endpoint {
    senders: VecDeque<Sender>,
    receivers: VecDeque<usize>,
}
struct EndpointTable {
    endpoints: Vec<Endpoint>,
}
pub(crate) enum EndpointResult {
    Matched { peer_pid: usize, msg: [usize; 4] },
    Blocked,
    InvalidEndpoint,
}
static TABLE: Mutex<EndpointTable> = Mutex::new(EndpointTable::new());
impl EndpointTable {
    const fn new() -> Self {
        Self {
            endpoints: Vec::new(),
        }
    }
    fn create(&mut self) -> u32 {
        let id = self.endpoints.len() as u32;
        self.endpoints.push(Endpoint {
            senders: VecDeque::new(),
            receivers: VecDeque::new(),
        });
        id
    }
    fn send(
        &mut self,
        ep_id: u32,
        sender_pid: usize,
        msg: [usize; 4],
    ) -> EndpointResult {
        let Some(ep) = self.endpoints.get_mut(ep_id as usize) else {
            return EndpointResult::InvalidEndpoint;
        };
        // Prefer a waiting receiver; otherwise park the sender with its message.
        if let Some(peer_pid) = ep.receivers.pop_front() {
            EndpointResult::Matched { peer_pid, msg }
        } else {
            ep.senders.push_back(Sender {
                pid: sender_pid,
                msg,
            });
            EndpointResult::Blocked
        }
    }
    fn receive(
        &mut self,
        ep_id: u32,
        receiver_pid: usize,
    ) -> EndpointResult {
        let Some(ep) = self.endpoints.get_mut(ep_id as usize) else {
            return EndpointResult::InvalidEndpoint;
        };
        // Prefer a waiting sender; otherwise park the receiver.
        if let Some(sender) = ep.senders.pop_front() {
            EndpointResult::Matched {
                peer_pid: sender.pid,
                msg: sender.msg,
            }
        } else {
            ep.receivers.push_back(receiver_pid);
            EndpointResult::Blocked
        }
    }
}
pub(crate) fn create() -> u32 {
    TABLE.lock().create()
}
pub(crate) fn send(
    ep_id: u32,
    sender_pid: usize,
    msg: [usize; 4],
) -> EndpointResult {
    TABLE.lock().send(ep_id, sender_pid, msg)
}
pub(crate) fn receive(ep_id: u32, receiver_pid: usize) -> EndpointResult {
    TABLE.lock().receive(ep_id, receiver_pid)
}

An endpoint has two FIFO queues: waiting senders and waiting receivers. In normal operation, at most one queue is non-empty — every send first tries to consume a receiver, and every receive first tries to consume a sender.

For a valid endpoint ID, each call resolves in one of three ways:

  • send finds a waiting receiver → returns Matched with the receiver’s PID and the message.
  • receive finds a waiting sender → returns Matched with the sender’s PID and message.
  • No partner → the caller is enqueued and Blocked is returned.

The endpoint table only manages queues. It does not touch process state or trap frames — that responsibility stays with the process table. This separation keeps endpoint matching simple while letting the scheduler control state transitions.
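The queue discipline can be exercised on the host with a small model. The sketch below is a minimal stand-in, assuming std collections instead of the kernel's alloc types and a single endpoint with no ID lookup. Outcome and invariant_holds are hypothetical names that do not appear in the kernel source.

```rust
use std::collections::VecDeque;

/// Result of one send or receive attempt on a single endpoint.
/// Mirrors EndpointResult without the InvalidEndpoint case.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Matched { peer_pid: usize, msg: [usize; 4] },
    Blocked,
}

#[derive(Default)]
pub struct Endpoint {
    senders: VecDeque<(usize, [usize; 4])>,
    receivers: VecDeque<usize>,
}

impl Endpoint {
    /// Consume the oldest waiting receiver, or queue the sender.
    pub fn send(&mut self, pid: usize, msg: [usize; 4]) -> Outcome {
        match self.receivers.pop_front() {
            Some(peer_pid) => Outcome::Matched { peer_pid, msg },
            None => {
                self.senders.push_back((pid, msg));
                Outcome::Blocked
            }
        }
    }

    /// Consume the oldest waiting sender, or queue the receiver.
    pub fn receive(&mut self, pid: usize) -> Outcome {
        match self.senders.pop_front() {
            Some((peer_pid, msg)) => Outcome::Matched { peer_pid, msg },
            None => {
                self.receivers.push_back(pid);
                Outcome::Blocked
            }
        }
    }

    /// The invariant from the text: the two queues are never both non-empty.
    pub fn invariant_holds(&self) -> bool {
        self.senders.is_empty() || self.receivers.is_empty()
    }
}
```

Because each operation pops from the opposite queue before pushing to its own, a push only happens when the opposite queue is already empty, which is why the invariant survives any interleaving of calls made under the table lock.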

Update src/sched/process.rs. The changes are scattered through the file, so here is the complete replacement:

use crate::endpoint::{self, EndpointResult};
use crate::memory::{PAGE_SIZE, alloc_frame, alloc_zeroed_frame};
use crate::paging::*;
use crate::trap::TrapFrame;
use alloc::vec::Vec;
use spin::Mutex;
const CODE_VA: usize = 0x10000;
const USER_STACK_BASE: usize = 0x0100_0000;
const USER_STACK_PAGES: usize = 4;
enum State {
    Ready,
    Running(usize),
    Sleeping(u64),
    BlockedOnEndpoint(u32),
    Exited,
}
struct Process {
    state: State,
    tf: *mut TrapFrame,
    satp: usize,
    tf_va: usize,
}
unsafe impl Send for Process {}
#[derive(Clone, Copy)]
pub(crate) struct Runnable {
    pub pid: usize,
    pub tf: *mut TrapFrame,
    pub satp: usize,
    pub tf_va: usize,
}
pub(crate) enum Next {
    Runnable(Runnable),
    SleepUntil(u64),
    Done,
    /// All live processes are blocked on endpoints with no senders/receivers
    /// to match them and no sleepers to wake.
    Deadlocked,
    Empty,
}
static TABLE: Mutex<ProcessTable> = Mutex::new(ProcessTable::new());
pub(crate) fn spawn_with_args(code: &[u8], args: [usize; 4]) -> usize {
    assert!(code.len() <= PAGE_SIZE, "user program too large");
    let code_pa = alloc_zeroed_frame().expect("oom");
    let tf_pa = alloc_frame().expect("oom");
    unsafe {
        core::ptr::copy_nonoverlapping(
            code.as_ptr(),
            code_pa as *mut u8,
            code.len(),
        );
    }
    let tf_va: usize = TRAMPOLINE - PAGE_SIZE;
    let pt = PageTable::alloc();
    pt.map_at(CODE_VA, code_pa, PTE_U | PTE_R | PTE_X, 0);
    let stack_top = crate::memory::map_stack(
        pt,
        USER_STACK_BASE,
        0,
        USER_STACK_PAGES,
        PTE_U,
    );
    pt.map_trampoline();
    pt.map_at(tf_va, tf_pa, PTE_R | PTE_W, 0);
    let tf_ptr = tf_pa as *mut TrapFrame;
    let mut tf = TrapFrame::new(CODE_VA, stack_top);
    tf.set_args(args);
    unsafe { tf_ptr.write(tf) };
    let pid = TABLE.lock().add(tf_ptr, pt.satp(), tf_va);
    println!("spawned pid {}", pid);
    pid
}
pub(crate) fn pick_next(now: u64, hartid: usize) -> Next {
    let mut table = TABLE.lock();
    table.wake_sleepers(now);
    if let Some(runnable) = table.pick_next(hartid) {
        return Next::Runnable(runnable);
    }
    if let Some(deadline) = table.next_sleep_deadline() {
        return Next::SleepUntil(deadline);
    }
    if table.all_exited() {
        return Next::Done;
    }
    if table.all_blocked_on_endpoints() {
        if hartid == crate::boot::boot_hartid() {
            table.report_blocked();
        }
        return Next::Deadlocked;
    }
    Next::Empty
}
pub(crate) fn ready(pid: usize) {
    TABLE.lock().ready(pid);
}
pub(crate) fn sleep_until(pid: usize, deadline: u64) {
    TABLE.lock().sleep_until(pid, deadline);
}
pub(crate) fn exit(pid: usize) {
    TABLE.lock().exit(pid);
}
pub(crate) fn send_endpoint(
    pid: usize,
    ep_id: u32,
    msg: [usize; 4],
    current_tf: &mut TrapFrame,
) {
    // Process table lock first, endpoint table lock second (inside endpoint::send).
    let mut table = TABLE.lock();
    match endpoint::send(ep_id, pid, msg) {
        EndpointResult::Matched { peer_pid, msg } => {
            current_tf.set_a0(0);
            table.write_message(peer_pid, msg);
            table.ready(peer_pid);
            table.ready(pid);
        }
        EndpointResult::Blocked => {
            table.block_on_endpoint(pid, ep_id);
        }
        EndpointResult::InvalidEndpoint => {
            println!("pid {}: invalid endpoint {}", pid, ep_id);
            table.exit(pid);
        }
    }
}
pub(crate) fn receive_endpoint(
    pid: usize,
    ep_id: u32,
    current_tf: &mut TrapFrame,
) {
    let mut table = TABLE.lock();
    match endpoint::receive(ep_id, pid) {
        EndpointResult::Matched { peer_pid, msg } => {
            current_tf.set_args(msg);
            table.write_send_status(peer_pid, 0);
            table.ready(peer_pid);
            table.ready(pid);
        }
        EndpointResult::Blocked => {
            table.block_on_endpoint(pid, ep_id);
        }
        EndpointResult::InvalidEndpoint => {
            println!("pid {}: invalid endpoint {}", pid, ep_id);
            table.exit(pid);
        }
    }
}
impl Process {
    fn new(tf: *mut TrapFrame, satp: usize, tf_va: usize) -> Self {
        Self {
            state: State::Ready,
            tf,
            satp,
            tf_va,
        }
    }
    fn to_runnable(&self, pid: usize) -> Runnable {
        Runnable {
            pid,
            tf: self.tf,
            satp: self.satp,
            tf_va: self.tf_va,
        }
    }
    fn sleeping_deadline(&self) -> Option<u64> {
        match self.state {
            State::Sleeping(deadline) => Some(deadline),
            _ => None,
        }
    }
    fn wake_if_due(&mut self, now: u64) {
        if self
            .sleeping_deadline()
            .is_some_and(|deadline| now >= deadline)
        {
            self.state = State::Ready;
        }
    }
}
struct ProcessTable {
    processes: Vec<Process>,
    cursor: usize,
}
impl ProcessTable {
    const fn new() -> Self {
        Self {
            processes: Vec::new(),
            cursor: 0,
        }
    }
    fn add(&mut self, tf: *mut TrapFrame, satp: usize, tf_va: usize) -> usize {
        let pid = self.processes.len();
        self.processes.push(Process::new(tf, satp, tf_va));
        pid
    }
    fn wake_sleepers(&mut self, now: u64) {
        for proc in &mut self.processes {
            proc.wake_if_due(now);
        }
    }
    fn pick_next(&mut self, hartid: usize) -> Option<Runnable> {
        let len = self.processes.len();
        (0..len).find_map(|_| {
            let idx = self.cursor % len;
            self.cursor = (idx + 1) % len;
            let proc = &mut self.processes[idx];
            if matches!(proc.state, State::Ready) {
                proc.state = State::Running(hartid);
                Some(proc.to_runnable(idx))
            } else {
                None
            }
        })
    }
    fn next_sleep_deadline(&self) -> Option<u64> {
        self.processes
            .iter()
            .filter_map(Process::sleeping_deadline)
            .min()
    }
    fn all_exited(&self) -> bool {
        !self.processes.is_empty()
            && self
                .processes
                .iter()
                .all(|proc| matches!(proc.state, State::Exited))
    }
    /// At least one process exists, and every live (non-exited) process is
    /// parked on an endpoint. With no sleepers and nothing runnable, no
    /// further progress is possible.
    fn all_blocked_on_endpoints(&self) -> bool {
        let mut any_live = false;
        for proc in &self.processes {
            match proc.state {
                State::Exited => {}
                State::BlockedOnEndpoint(_) => any_live = true,
                _ => return false,
            }
        }
        any_live
    }
    fn report_blocked(&self) {
        println!("deadlock: all live processes blocked on endpoints");
        for (pid, proc) in self.processes.iter().enumerate() {
            if let State::BlockedOnEndpoint(ep) = proc.state {
                println!(" pid {} blocked on endpoint {}", pid, ep);
            }
        }
    }
    fn ready(&mut self, pid: usize) {
        self.processes[pid].state = State::Ready;
    }
    fn sleep_until(&mut self, pid: usize, deadline: u64) {
        self.processes[pid].state = State::Sleeping(deadline);
    }
    fn exit(&mut self, pid: usize) {
        self.processes[pid].state = State::Exited;
    }
    fn block_on_endpoint(&mut self, pid: usize, endpoint_id: u32) {
        self.processes[pid].state = State::BlockedOnEndpoint(endpoint_id);
    }
    fn write_message(&mut self, pid: usize, msg: [usize; 4]) {
        let tf = unsafe { &mut *self.processes[pid].tf };
        tf.set_args(msg);
    }
    fn write_send_status(&mut self, pid: usize, status: usize) {
        let tf = unsafe { &mut *self.processes[pid].tf };
        tf.set_a0(status);
    }
}

Key additions:

  • BlockedOnEndpoint(u32): the process is waiting for a partner on a specific endpoint. The state carries the endpoint ID so the deadlock report can show which endpoint each blocked process is waiting on.
  • Deadlocked: a new Next variant. When every live process is BlockedOnEndpoint and nothing is runnable or sleeping, no message can ever arrive, so reporting and halting beats spinning on wfi forever. src/sched/mod.rs handles this separately from Done so the boot hart prints an endpoint-deadlock halt message instead of all processes exited.
  • spawn_with_args now uses its args: it sets the initial a0 through a3 before the first sret, which is how the demo passes the endpoint ID. It also allocates the code page with alloc_zeroed_frame, so unused bytes decode to a deterministic illegal instruction instead of stale frame contents.
  • send_endpoint / receive_endpoint: both hold the process table lock while calling endpoint::send or endpoint::receive, so enqueuing a waiter and updating process state happen atomically. If the endpoint call finds a partner, both processes' trap frames are updated and both become Ready. Invalid endpoint IDs are reported and the offending process exits, so neither helper needs to return a result to the syscall handler.

The lock ordering is always process table first, endpoint table second. The endpoint module never takes the process table lock, so there is no deadlock risk.
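The ordering rule can be shown in miniature. The following is a host-side sketch, assuming std::sync::Mutex in place of spin::Mutex; PROCESS_TABLE, ENDPOINT_TABLE, endpoint_enqueue, and block_sender are hypothetical stand-ins, not the kernel's items.

```rust
use std::sync::Mutex;

// Hypothetical stand-ins for the two kernel tables.
static PROCESS_TABLE: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
static ENDPOINT_TABLE: Mutex<Vec<usize>> = Mutex::new(Vec::new());

/// Inner layer: takes only the endpoint lock and never reaches back
/// for the process lock.
fn endpoint_enqueue(pid: usize) -> usize {
    let mut eps = ENDPOINT_TABLE.lock().unwrap();
    eps.push(pid);
    eps.len()
}

/// Outer layer: process lock first, then the endpoint lock via the helper.
/// Every nested acquisition in the program follows this one order.
pub fn block_sender(pid: usize) -> usize {
    let mut procs = PROCESS_TABLE.lock().unwrap();
    let queued = endpoint_enqueue(pid); // nested acquisition, fixed order
    procs.push("blocked");
    queued
}
```

A lock-order deadlock needs two threads that acquire the same pair of locks in opposite orders. Since the inner layer never takes the outer lock, that pattern cannot occur here, however many threads call block_sender concurrently.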

pick_next can now return Next::Deadlocked, so update the scheduler match in src/sched/mod.rs and give halt an explicit message:

Next::SleepUntil(deadline) => sleep_until_timer(deadline),
Next::Done => halt(hartid == crate::boot::boot_hartid(), "all processes exited"),
Next::Deadlocked => halt(
    hartid == crate::boot::boot_hartid(),
    "system halted: endpoint deadlock",
),
Next::Empty => {

Then update halt:

fn halt(boot: bool, message: &str) -> ! {
    sbi::set_timer(u64::MAX);
    if boot {
        println!("{}", message);
    }
    loop {
        disable_interrupts();
        unsafe { core::arch::asm!("wfi") };
    }
}

The normal exit path still prints all processes exited; endpoint deadlock now gets a distinct final line.
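Which final line appears depends only on the set of process states at the moment nothing is runnable. The sketch below restates that decision as a pure function so it can be checked on the host. It is an illustration under assumptions: final_line is a hypothetical helper, and the State enum is a local copy of the one in process.rs rather than the kernel's type.

```rust
/// Local copy of the process states, for host-side experimentation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Ready,
    Running(usize),
    Sleeping(u64),
    BlockedOnEndpoint(u32),
    Exited,
}

/// The halt message the boot hart would print for this set of states,
/// or None if the system should keep running (or has no processes).
pub fn final_line(states: &[State]) -> Option<&'static str> {
    if states.is_empty() {
        return None;
    }
    let live: Vec<&State> = states
        .iter()
        .filter(|s| **s != State::Exited)
        .collect();
    if live.is_empty() {
        Some("all processes exited")
    } else if live.iter().all(|s| matches!(**s, State::BlockedOnEndpoint(_))) {
        Some("system halted: endpoint deadlock")
    } else {
        // Something is ready, running, or sleeping: progress is still possible.
        None
    }
}
```

A single sleeper or a process running on another hart is enough to rule out the deadlock verdict, since either one could still perform the send or receive that unblocks the rest.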

Update src/sched/syscall.rs:

use crate::trap::TrapFrame;
use super::{process, timer};
const SYS_EXIT: usize = 1;
const SYS_SLEEP: usize = 2;
const SYS_YIELD: usize = 3;
const SYS_EP_CREATE: usize = 10;
const SYS_EP_SEND: usize = 11;
const SYS_EP_RECV: usize = 12;
const SYS_PUTCHAR: usize = 100;
pub(super) fn handle(pid: usize, tf: &mut TrapFrame) {
    match tf.a7() {
        SYS_EXIT => {
            println!("pid {}: exited", pid);
            process::exit(pid);
        }
        SYS_SLEEP => {
            let ms = tf.a0() as u64;
            process::sleep_until(
                pid,
                timer::rdtime() + ms * timer::TICKS_PER_MS,
            );
        }
        SYS_YIELD => {
            process::ready(pid);
        }
        SYS_EP_CREATE => {
            let ep_id = crate::endpoint::create();
            tf.set_a0(ep_id as usize);
            process::ready(pid);
        }
        SYS_EP_SEND => {
            let ep_id = tf.a0() as u32;
            // Two payload words from a1/a2; the remaining slots are zero.
            let msg = [tf.a1(), tf.a2(), 0, 0];
            process::send_endpoint(pid, ep_id, msg, tf);
        }
        SYS_EP_RECV => {
            let ep_id = tf.a0() as u32;
            process::receive_endpoint(pid, ep_id, tf);
        }
        SYS_PUTCHAR => {
            print!("{}", tf.a0() as u8 as char);
            process::ready(pid);
        }
        nr => {
            println!("pid {}: unknown syscall {nr}", pid);
            process::exit(pid);
        }
    }
}

Three new syscalls:

  • SYS_EP_CREATE (10): creates an endpoint and returns its ID in a0.
  • SYS_EP_SEND (11): a0 = endpoint ID, a1/a2 = two message words. Blocks if no receiver is waiting. Returns a0 = 0 after delivery.
  • SYS_EP_RECV (12): a0 = endpoint ID. Blocks if no sender is waiting. Returns the two message words in a0/a1; a2 and a3 are overwritten with 0, because set_args writes all four registers.

Each handler arm sets the next state for pid directly (ready, sleeping, exited, or BlockedOnEndpoint via the endpoint helpers) and returns. Send and receive may park the process, in which case pick_next will skip it until a partner arrives. Invalid endpoint ids are logged inside send_endpoint/receive_endpoint, which then mark the offending process Exited.
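The register traffic across one rendezvous is easy to get wrong by one position, because the payload shifts down: the sender's a1/a2 arrive as the receiver's a0/a1. The sketch below models only that shuffle, under the assumption that it matches the handler above; Regs, pack, and deliver are hypothetical names.

```rust
/// Registers a0..a3 of one user process at an ecall boundary.
pub type Regs = [usize; 4];

/// What the SYS_EP_SEND arm extracts: a0 is the endpoint ID,
/// a1/a2 are the payload, and the last two message slots are zero.
pub fn pack(sender: Regs) -> (u32, [usize; 4]) {
    (sender[0] as u32, [sender[1], sender[2], 0, 0])
}

/// Register state after a successful rendezvous, as (sender, receiver).
pub fn deliver(sender: Regs) -> (Regs, Regs) {
    let (_ep_id, msg) = pack(sender);
    let mut sender_after = sender;
    sender_after[0] = 0; // status; the sender's a1..a3 are left untouched
    // The receiver has all of a0..a3 overwritten with the message.
    (sender_after, msg)
}
```

For example, a sender with a0..a3 = [5, 'I', 'P', 99] ends with [0, 'I', 'P', 99], while the receiver ends with ['I', 'P', 0, 0] regardless of what it held before the call.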

Update src/demo.rs:

use crate::sched::process;
pub fn spawn() {
    let ep_id = crate::endpoint::create();
    println!("created endpoint {}", ep_id);
    let args = [ep_id as usize, 0, 0, 0];
    process::spawn_with_args(
        prog_bytes(_prog_a_start, _prog_a_end),
        args,
    );
    process::spawn_with_args(
        prog_bytes(_prog_b_start, _prog_b_end),
        args,
    );
}
fn prog_bytes(
    start: unsafe extern "C" fn() -> u8,
    end: unsafe extern "C" fn() -> u8,
) -> &'static [u8] {
    let s = start as usize;
    let len = end as usize - s;
    unsafe { core::slice::from_raw_parts(s as *const u8, len) }
}
unsafe extern "C" {
    fn _prog_a_start() -> u8;
    fn _prog_a_end() -> u8;
    fn _prog_b_start() -> u8;
    fn _prog_b_end() -> u8;
}
core::arch::global_asm!(
    r#"
    .pushsection .rodata.user_prog, "a"
    .globl _prog_a_start
_prog_a_start:
    mv s5, a0 # save endpoint ID
    li a7, 12 # SYS_EP_RECV
    mv a0, s5
    ecall
    mv s3, a0 # received msg[0]
    mv s4, a1 # received msg[1]
    li a7, 100 # print received chars
    mv a0, s3
    ecall
    li a7, 100
    mv a0, s4
    ecall
    li a7, 100
    li a0, '\n'
    ecall
    li a7, 100 # print "A1\n"
    li a0, 'A'
    ecall
    li a7, 100
    li a0, '1'
    ecall
    li a7, 100
    li a0, '\n'
    ecall
    li a7, 2 # SYS_SLEEP, 100 ms
    li a0, 100
    ecall
    li a7, 100 # print "A2\n"
    li a0, 'A'
    ecall
    li a7, 100
    li a0, '2'
    ecall
    li a7, 100
    li a0, '\n'
    ecall
    li a7, 1 # SYS_EXIT
    ecall
    .globl _prog_a_end
_prog_a_end:
    .globl _prog_b_start
_prog_b_start:
    mv s5, a0 # save endpoint ID
    li a7, 11 # SYS_EP_SEND
    mv a0, s5
    li a1, 'I'
    li a2, 'P'
    ecall
    li a7, 2 # sleep 20ms (let receiver print first)
    li a0, 20
    ecall
    li a7, 100 # print "B1\n"
    li a0, 'B'
    ecall
    li a7, 100
    li a0, '1'
    ecall
    li a7, 100
    li a0, '\n'
    ecall
    li a7, 2 # SYS_SLEEP, 200 ms
    li a0, 200
    ecall
    li a7, 100 # print "B2\n"
    li a0, 'B'
    ecall
    li a7, 100
    li a0, '2'
    ecall
    li a7, 100
    li a0, '\n'
    ecall
    li a7, 1 # SYS_EXIT
    ecall
    .globl _prog_b_end
_prog_b_end:
    .popsection
    "#
);

The demo creates one endpoint and passes its ID to both processes as a0:

  • Program A receives on the endpoint (blocks until B sends), then prints the two received bytes (I and P). Continues with print/sleep/exit.
  • Program B sends 'I' and 'P' on the endpoint, sleeps 20 ms to let A print the received message first, then continues with print/sleep/exit.

spawn_with_args places the endpoint ID in a0 before the first sret. The user programs save it to s5 (a callee-saved register) on entry.

Add mod endpoint:

#![no_std]
#![no_main]
extern crate alloc;
#[macro_use]
mod utils;
mod boot;
mod demo;
mod endpoint;
mod memory;
mod paging;
mod sched;
mod trap;
core::arch::global_asm!(include_str!("../boot.S"));
#[unsafe(no_mangle)]
pub extern "C" fn kernel_main(hartid: usize, dtb_ptr: usize) -> ! {
    let stack_top = boot::init(hartid, dtb_ptr);
    unsafe { _switch_to_stack(stack_top, kernel_main_on_stack, hartid) }
}
unsafe extern "C" {
    fn _switch_to_stack(
        stack_top: usize,
        entry: extern "C" fn(usize) -> !,
        arg0: usize,
    ) -> !;
}
extern "C" fn kernel_main_on_stack(hartid: usize) -> ! {
    trap::init_hart();
    demo::spawn();
    boot::start_secondary_harts();
    sched::run(hartid)
}

Build and run:

cargo run --release
created endpoint 0
spawned pid 0
spawned pid 1
hart 1 online
IP
A1
B1
A2
pid 0: exited
B2
pid 1: exited
all processes exited

On multi-hart QEMU, this output is intentionally not byte-for-byte stable: secondary hart startup messages can interleave with process putchar syscalls. The durable check is that both I and P are delivered before both processes exit.

The IPC sequence:

  1. An endpoint is created (ID 0) and passed to both processes.
  2. Program A calls SYS_EP_RECV — no sender is waiting, so A blocks.
  3. Program B calls SYS_EP_SEND with message ['I', 'P'] — A is waiting, so the rendezvous succeeds: B gets a0 = 0, A gets a0 = 'I' and a1 = 'P'. Both become Ready.
  4. A prints “IP\n”. B sleeps 20 ms, then prints “B1\n”.
  5. The rest proceeds as in Chapter 7.
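The line order in the expected output follows from the sleep durations in the two programs. A rough sketch of that arithmetic is shown below, taking the rendezvous as time zero and ignoring execution time and scheduling latency; timeline is a hypothetical helper and the timestamps are nominal, not measured.

```rust
/// Nominal (time in ms, line printed) pairs for both programs,
/// merged into the order they would reach the console.
pub fn timeline() -> Vec<(u64, &'static str)> {
    // Program A: prints immediately, then sleeps 100 ms.
    let a = [
        (0u64, "IP"),
        (0, "A1"),
        (100, "A2"),
        (100, "pid 0: exited"),
    ];
    // Program B: sleeps 20 ms, prints, then sleeps another 200 ms.
    let b = [(20u64, "B1"), (220, "B2"), (220, "pid 1: exited")];

    let mut events: Vec<(u64, &'static str)> =
        a.iter().chain(b.iter()).copied().collect();
    // A stable sort keeps each program's own order for equal timestamps.
    events.sort_by_key(|&(t, _)| t);
    events
}
```

The 20 ms sleep in B is what places B1 after A1, and the 100 ms versus 220 ms wake-ups place A2 and the exit of pid 0 ahead of B2.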

If the IP line does not appear, check that spawn_with_args calls tf.set_args(args) and that the user program saves a0 to a callee-saved register before the first ecall.

OpenSBI
  -> boot::init (memory, paging, heap, stack)
  -> kernel_main_on_stack
     -> trap::init_hart
     -> endpoint::create (endpoint 0)
     -> spawn pid 0 (A, receives on ep 0)
     -> spawn pid 1 (B, sends on ep 0)
     -> start_secondary_harts
     -> sched::run
        -> A calls EP_RECV -> blocks
        -> B calls EP_SEND -> matches A
           -> A gets message in a0/a1
           -> B gets status in a0
           -> both Ready
        -> normal print/sleep/exit cycle
        -> all exited -> halt

The kernel now supports synchronous IPC between processes. Endpoints are the foundation of microkernel design: instead of building every service into the kernel, user-space servers can provide services through endpoints.

Any process can currently send to any endpoint by its integer ID. Chapter 9 explores capabilities — unforgeable tokens that gate access to kernel objects. Instead of passing a raw endpoint ID, a process would hold a capability that grants specific permissions on a specific endpoint.