Build a Safe 'Process Roulette' Simulator to Learn OS Signals and Process Management

2026-03-01

Build a sandboxed "Process Roulette" simulator to safely learn signals, fork, and process trees—no system crashes.

Learn OS signals and process trees without risking your system

Want hands-on practice with process management, signals, and forking—but terrified of running a "process roulette" tool that might crash your machine? You're not alone. Many students and teachers avoid chaos-driven learning because uncontrolled experiments can kill desktop services or corrupt state. This walkthrough shows how to build a safe, local "Process Roulette" simulator that mimics random process termination inside a confined sandbox. You'll learn fork, signal handling, process trees, monitoring, and modern sandboxing best practices used in 2026—without ever touching system-critical processes.

Quick summary

Goal: Create a simulator that spawns a controlled process tree and randomly sends signals to simulated processes inside a sandbox.
Safety: Run inside a PID namespace or a rootless container, use seccomp and cgroups to limit effects, and test locally.
Languages: Python for rapid prototyping; C for low-level signal handling examples.
Tools: unshare/clone, Docker/Podman (rootless), eBPF tracing for observability, and standard POSIX syscalls.

Why this matters in 2026

By 2026, observability and safe sandboxing are mainstream in developer education. Rootless containers, better user-namespace support, and eBPF-based observability tools (bpftrace, tracee, Cilium/Hubble) let students experiment with OS internals without privileged access. This project bridges classic OS concepts taught in class—signals, forks, waitpid, session/process groups—with modern sandbox practices, preparing learners for real-world debugging and cloud-native environments.

Project overview: What you'll build

A worker program that installs signal handlers and simulates workload.
A controller that spawns a process tree (multiple workers) using fork (or subprocesses) and keeps a registry of PIDs.
A random "roulette" event loop that selects a target from the registry and sends a signal (SIGTERM, SIGKILL, SIGSTOP, etc.).
Monitoring utilities to observe the process tree safely (pstree, /proc, eBPF traces).
Sandboxing steps to ensure the roulette only affects the simulated processes.

Design principles

Never touch system PIDs. Work in a new PID namespace or container so PIDs in the simulation are isolated.
Limit capabilities and resources. Use seccomp to restrict syscalls, setrlimit to cap CPU time and memory, and cgroups v2 for resource control.
Graceful handling. Make workers handle SIGTERM and write logs; demonstrate differences between SIGTERM and SIGKILL.
Observability. Use /proc, ps, and eBPF to show what happened after signals are sent.
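The resource-limit principle can be sketched with Python's standard resource module. This is a minimal illustration, not part of the project code: the helper names limit_cpu and run_cpu_bound_child are hypothetical, and the busy-loop child stands in for a runaway worker. The soft limit is set below the hard limit so the kernel delivers SIGXCPU rather than going straight to SIGKILL.

```python
import resource
import signal
import subprocess
import sys


def limit_cpu(soft_seconds=1, hard_seconds=2):
    """Return a preexec_fn that caps CPU time for the child only."""
    def apply():
        # Runs in the child between fork() and exec(); the parent is unaffected.
        resource.setrlimit(resource.RLIMIT_CPU, (soft_seconds, hard_seconds))
    return apply


def run_cpu_bound_child():
    # Hypothetical stand-in for a runaway worker: a busy loop that never sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "while True: pass"],
        preexec_fn=limit_cpu(),
    )
    try:
        return child.wait(timeout=30)
    except subprocess.TimeoutExpired:
        # If the environment does not enforce RLIMIT_CPU, do not leave the loop running.
        child.kill()
        child.wait()
        raise


if __name__ == "__main__":
    rc = run_cpu_bound_child()
    # A negative return code -N means "terminated by signal N".
    print("child return code:", rc, "SIGXCPU number:", int(signal.SIGXCPU))
```

Because the limit is applied through preexec_fn, it affects only the spawned child, which keeps the controller itself unrestricted. Memory could be capped the same way with RLIMIT_AS.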

Part A — A safe rapid prototype in Python (recommended for beginners)

Python makes it easy to prototype a controller and worker. The controller spawns worker processes with subprocess (multiprocessing also works), tracks their PIDs, and sends signals with os.kill(pid, sig) or Popen.send_signal(sig). For safety, run the whole experiment inside a rootless container (Docker/Podman) or, if unprivileged user namespaces are enabled, under unshare --user --map-root-user --pid --fork.

worker.py (Python)

#!/usr/bin/env python3
import os
import signal
import time
import random

running = True

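def handle_term(signum, frame):
    global running
    # Print the signal name (e.g. SIGTERM) and flush so the line survives redirection
    print(f"[worker {os.getpid()}] Received signal: {signal.Signals(signum).name}", flush=True)
    running = False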

signal.signal(signal.SIGTERM, handle_term)
signal.signal(signal.SIGINT, handle_term)

print(f"[worker {os.getpid()}] Started")
# Simulate work loop
while running:
    work = random.random() * 0.2
    time.sleep(work)
    # Periodically print to show liveness
    if random.random() < 0.05:
        print(f"[worker {os.getpid()}] alive")

print(f"[worker {os.getpid()}] Exiting cleanly")

controller.py (Python)

#!/usr/bin/env python3
import random
import signal
import subprocess
import sys
import time

NUM_WORKERS = 8
SIGNALS = [signal.SIGTERM, signal.SIGSTOP, signal.SIGCONT]
workers = []

# Spawn workers as separate processes, using the same interpreter as the controller
for i in range(NUM_WORKERS):
    p = subprocess.Popen([sys.executable, "worker.py"])  # spawn worker.py
    print(f"Spawned worker PID {p.pid}")
    workers.append(p)

try:
    while True:
        # poll() reaps exited workers, so only live Popen objects remain here
        alive = [p for p in workers if p.poll() is None]
        if not alive:
            print("All workers exited")
            break
        # Randomly pick a worker and a signal
        target = random.choice(alive)
        sig = random.choice(SIGNALS)
        print(f"Controller sending {sig.name} to PID {target.pid}")
        # send_signal() re-checks the child first, so it never signals a reaped PID
        target.send_signal(sig)
        time.sleep(random.uniform(0.5, 2.0))
except KeyboardInterrupt:
    print("Controller interrupted, cleaning up")
finally:
    for p in workers:
        if p.poll() is None:
            p.terminate()                  # SIGTERM: ask for a clean exit
            p.send_signal(signal.SIGCONT)  # a stopped worker only sees SIGTERM once resumed
    for p in workers:
        try:
            p.wait(timeout=5)
        except subprocess.TimeoutExpired:
            p.kill()                       # SIGKILL as a last resort
            p.wait()
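Run the above inside a container or an unshared PID namespace to ensure safety. Creating a PID namespace normally needs root, so the example below also creates a user namespace (--user --map-root-user), which lets it run without sudo where unprivileged user namespaces are enabled:

unshare --user --map-root-user --fork --pid --mount-proc python3 controller.py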
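As an extra safety net, the controller could refuse to start when it does not appear to be sandboxed. The sketch below is a heuristic, not a guarantee: the function names and the ROULETTE_I_AM_SANDBOXED variable are hypothetical, and the check relies on two assumptions, that a program launched directly by unshare --fork --pid runs as PID 1, and that the container runtime sets a container environment variable (Podman commonly does; Docker does not).

```python
import os
import sys


def looks_sandboxed(pid, environ):
    """Heuristic only: True if we appear to be in a fresh PID namespace or a container."""
    # With `unshare --fork --pid`, the launched program becomes PID 1 of the new namespace.
    if pid == 1:
        return True
    # Podman and systemd-nspawn commonly set the `container` variable; treat it as a hint.
    if environ.get("container"):
        return True
    # Hypothetical opt-in override for instructors who sandbox some other way (e.g. a VM).
    return environ.get("ROULETTE_I_AM_SANDBOXED") == "1"


def require_sandbox():
    """Exit early instead of sending signals on an unsandboxed machine."""
    if not looks_sandboxed(os.getpid(), os.environ):
        sys.exit("Refusing to run: no sandbox detected (PID namespace or container).")
```

Calling require_sandbox() as the first statement of controller.py would stop an accidental run on a shared machine. If the controller is started from an interactive shell inside the namespace, its PID is not 1, so the override variable or the container hint is needed.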

Part B — Low-level C example: fork, signal handlers, waitpid

For OS internals classes, writing the worker in C shows low-level behavior: SIGCHLD, reaping, and signal masks. Here's a compact example demonstrating fork, a signal handler, and reaping children.

roulette.c

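#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define N 6
static pid_t children[N];                      // 0 = slot never filled or already reaped
static volatile sig_atomic_t got_sigchld = 0;  // written only by the handler

// Async-signal-safe handler: record the event and do the real work in main().
// (printf is not async-signal-safe, so it must not be called from here.)
static void on_sigchld(int sig) {
    (void)sig;
    got_sigchld = 1;
}

// Reap every exited child without blocking and clear its registry slot.
static int reap_children(void) {
    int status, reaped = 0;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        for (int i = 0; i < N; ++i)
            if (children[i] == pid) children[i] = 0;
        printf("[controller] reaped child %d\n", (int)pid);
        ++reaped;
    }
    return reaped;
}

int main(void) {
    struct sigaction sa = {0};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);                 // block no extra signals while handling
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  // restart syscalls; ignore stop/continue events
    sigaction(SIGCHLD, &sa, NULL);
    srand((unsigned)time(NULL));
    setvbuf(stdout, NULL, _IOLBF, 0);         // line-buffer so forked children don't replay output

    int alive = 0;
    for (int i = 0; i < N; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            // child: SIGTERM and SIGKILL keep their default, terminating dispositions
            printf("[child %d] starting\n", (int)getpid());
            for (;;) pause();  // wait for signals
        } else if (pid > 0) {
            children[i] = pid;
            ++alive;
        } else {
            perror("fork");
            break;  // carry on with the children that did start
        }
    }

    // Controller loop: randomly signal children that are still in the registry
    for (int round = 0; round < 50 && alive > 0; ++round) {
        int idx = rand() % N;
        if (children[idx] != 0) {
            int sig = (rand() % 2) ? SIGTERM : SIGKILL;
            printf("[controller] sending %s to %d\n",
                   (sig == SIGTERM) ? "SIGTERM" : "SIGKILL", (int)children[idx]);
            kill(children[idx], sig);
        }
        sleep(1);  // returns early when SIGCHLD arrives
        if (got_sigchld) {
            got_sigchld = 0;
            alive -= reap_children();
        }
    }

    // Cleanup: kill any survivors, then reap them with a blocking wait
    for (int i = 0; i < N; ++i) {
        if (children[i] != 0) {
            kill(children[i], SIGKILL);
            waitpid(children[i], NULL, 0);
        }
    }
    printf("[controller] done\n");
    return 0;
}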

Compile and run inside a sandbox:

gcc -Wall -Wextra -o roulette roulette.c
# Prefer a sandbox: podman run --rm -it -v "$PWD":/work -w /work --userns=keep-id registry.fedoraproject.org/fedora:latest bash
# inside container: ./roulette

Sandboxing: How to keep this safe

Sandboxing is the most important part. Here are practical options (choose one):

Run inside a rootless container (Docker or Podman rootless). This isolates PIDs and filesystem access. Modern distros (2024–2026) improved rootless UX—Podman is a great option.
Use unshare for lightweight isolation: unshare --user --map-root-user --pid --fork --mount-proc python3 controller.py. This works without sudo on kernels that allow unprivileged user namespaces and is convenient for labs.
Use a VM: For classes with mixed OSes or strict policies, a disposable VM (Multipass, Vagrant) is safe and simple.
Apply seccomp and cgroups v2: Restrict syscalls (seccomp) and CPU/memory (cgroups) to avoid runaway processes—pluggable in containers or with libseccomp.

Example: unshare command (quick)

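unshare --user --map-root-user --fork --pid --mount-proc bash
# inside new namespace: run your controller and workers

Note: --map-root-user creates a user namespace and maps your account to root inside it, so no sudo is needed; it requires unprivileged user namespace support, which varies by distribution. Adding --root <dir> also changes the root directory, but that directory must already contain a usable root filesystem (shell, libraries, interpreter). Rootless Podman also avoids sudo and is preferred for student environments.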

Monitoring and observability (practical exercises)

Students should learn to observe what happens when signals are sent. Try these exercises:

Use pstree -p and ps aux --forest to inspect the process tree inside the sandbox.
Read /proc/<pid>/status to inspect process state (e.g., State, TracerPid, voluntary_ctxt_switches).
Use strace -p <pid> on a worker for a short time to see syscalls when signal-handling code runs (inside sandbox only).
Use eBPF tools (bpftrace or tracee) to show signal delivery events without affecting the processes. Example bpftrace snippet for signals:
```
// 15 is SIGTERM; args->pid is the target of the signal, the builtin pid is the sender
tracepoint:signal:signal_generate
/args->sig == 15/ { printf("PID %d sent SIGTERM to %d\n", pid, args->pid); }
```
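The /proc exercise above can also be scripted. The following sketch parses /proc/<pid>/status and decodes the SigCgt bitmask, which lists the signals a process has installed handlers for. The function names are illustrative, and the code assumes a Linux /proc filesystem.

```python
import signal


def parse_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def decode_sigmask(hex_mask):
    """Bit (n - 1) of the mask is set when signal number n is in the set."""
    mask = int(hex_mask, 16)
    names = set()
    for sig in signal.Signals:
        if mask & (1 << (sig.value - 1)):
            names.add(sig.name)
    return names


def describe(pid):
    """Summarise state, tracer, and caught signals for one process."""
    with open(f"/proc/{pid}/status") as f:
        fields = parse_status(f.read())
    return {
        "state": fields.get("State"),
        "tracer_pid": fields.get("TracerPid"),
        "caught": sorted(decode_sigmask(fields.get("SigCgt", "0"))),
    }
```

Running describe(pid) on a worker should list SIGINT and SIGTERM under "caught", matching the two handlers worker.py installs. A nonzero tracer_pid indicates that strace or a debugger is attached.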

Teaching exercises and learning outcomes

Here are ready-to-run exercises you can give students:

Compare SIGTERM vs SIGKILL: Modify the worker to write to disk on SIGTERM but not SIGKILL. Observe whether writes happen.
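Zombies and orphans: Kill a worker while the controller is not calling wait and observe the defunct (Z) entry. Then kill the controller and watch the orphaned workers get reparented to init (PID 1 in the namespace), which reaps them. If the controller itself is PID 1 of the namespace, killing it tears down every process in that namespace, so start it from a shell inside the sandbox for this exercise.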
Session and process groups: Have the controller create a new session with setsid and send signals to a process group.
Resource limits: Use setrlimit to set RLIMIT_CPU to 1 second on a CPU-bound worker (a busy loop, since a sleeping worker consumes almost no CPU time) and observe the kernel deliver SIGXCPU.
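The first exercise, comparing SIGTERM with SIGKILL, can be sketched as a self-contained script. It is an illustration rather than the project's worker: the inline worker source, the marker file, and the function name marker_written_after are all hypothetical. The worker prints "ready" once its handler is installed so the parent never signals it too early.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Inline worker (illustrative): writes a marker file from its SIGTERM handler, then exits.
WORKER_SOURCE = """
import signal, sys, time
marker = sys.argv[1]
def on_term(signum, frame):
    with open(marker, "w") as f:
        f.write("cleaned up")
    sys.exit(0)
signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)   # tell the parent the handler is installed
while True:
    time.sleep(0.1)
"""


def marker_written_after(sig):
    """Send `sig` to a fresh worker and report whether its cleanup marker exists."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker.txt")
        child = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, marker],
            stdout=subprocess.PIPE,
            text=True,
        )
        child.stdout.readline()      # block until the worker prints "ready"
        child.send_signal(sig)
        child.wait(timeout=10)
        child.stdout.close()
        return os.path.exists(marker)


if __name__ == "__main__":
    print("SIGTERM ran cleanup:", marker_written_after(signal.SIGTERM))
    print("SIGKILL ran cleanup:", marker_written_after(signal.SIGKILL))
```

The SIGTERM run should report that cleanup happened and the SIGKILL run that it did not, because SIGKILL cannot be caught and the handler never gets a chance to run.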

Advanced strategies and 2026 trends

As of 2026, several trends make this project more powerful and relevant:

eBPF-first observability: Teachers can show signal delivery and syscall patterns with eBPF without instrumenting student code.
Rootless sandboxing: Rootless Podman and improved user namespace support make safe experimentation accessible to undergrads.
Containers as learning units: Many curricula now ship reproducible exercises as lightweight containers that include monitoring tools preinstalled.
Policy-based safety: Combine seccomp filters and cgroups v2 to enforce lab policies even when users have container shells.

Common pitfalls and troubleshooting

Zombies: If you see defunct processes, show how calling waitpid from a SIGCHLD handler (or waiting in the controller loop) prevents them from accumulating.
Unprivileged namespace errors: If unshare fails, fall back to a rootless Podman container or a VM; the kernel configuration might disable unprivileged user namespaces.
PID confusion: Inside a PID namespace, PIDs restart at 1—remind students to interpret PIDs relative to the namespace.
Accidentally killing the wrong process: Always run in a sandbox and avoid scripts that parse ps output carelessly—use tracked subprocess objects or saved PIDs from fork.
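The zombie pitfall is easy to reproduce on purpose. The sketch below, assuming Linux with /proc mounted, forks a child that exits immediately, reads its state before the parent waits, and then reaps it. The function names are illustrative.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only), or None."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except OSError:
        return None
    # The state letter follows the parenthesised command name, which may contain spaces.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_then_reap():
    """Fork a short-lived child; return (state before wait, state after wait, reaped ok)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                  # child exits immediately
    # The parent has not called wait yet, so the exited child lingers as a zombie.
    state = None
    deadline = time.monotonic() + 5
    while time.monotonic() < deadline:
        state = proc_state(pid)
        if state == "Z":
            break
        time.sleep(0.01)
    reaped_pid, _status = os.waitpid(pid, 0)   # reaping removes the zombie entry
    return state, proc_state(pid), reaped_pid == pid


if __name__ == "__main__":
    before, after, ok = zombie_then_reap()
    print("state before wait:", before)
    print("state after wait:", after)
    print("reaped the expected PID:", ok)
```

On Linux the state before the wait should be Z, and the /proc entry should be gone afterwards. The same check works on the controller's workers when its reaping logic is disabled.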

Example lesson plan (50–90 minute class)

10 min: Brief recap of signals and process states.
15 min: Instructor demo—run controller inside a container and show pstree.
25 min: Students implement a signal handler in the worker and change behavior for SIGTERM/SIGINT.
20 min: Exercises—use eBPF to observe signal delivery, implement reaper logic in controller, and experiment with SIGSTOP/SIGCONT.
5–10 min: Wrap-up. Discuss kernel-level behavior and modern sandboxing strategies.

Actionable takeaways

Always sandbox experiments that send signals: use rootless containers or unshare.
Teach both graceful shutdown (SIGTERM) and hard kill (SIGKILL) semantics—students need to see the difference.
Use small, instrumented worker programs that print state on signal receipt for clarity.
Leverage modern tools (eBPF) for non-intrusive observability while keeping the kernel and system intact.

Final notes: ethical and classroom safety

Teaching experiments that intentionally kill processes requires ethical consideration. Always make exercises reproducible and reversible. Don't encourage students to run process-killing tools on multi-user systems, shared servers, or production machines. Provide clear instructions for cleanup, and prefer ephemeral environments (containers/VMs).

Call to action

Ready to try it? Clone the starter repo (worker/controller examples), run it in a rootless Podman or unshare sandbox, and post your observations. Share students' logs, tricky bugs, or an improved seccomp profile in the classroom forum—I'll review and suggest improvements. If you want a ready-made lab pack with exercises and automated sandbox setup for your course, sign up to get the free lab kit and instructor notes.
