THE JUNKYARD
Pi-Monitor

Lightweight System Monitoring for Pi

A Rust-based monitoring agent that reads /proc metrics and serves them via a Prometheus endpoint, a JSON API, and a live web dashboard.

741 KB Binary Size
5 Metric Types
4 HTTP Endpoints
0 Dependencies on Pi
View Source on GitHub · Read Full Documentation
$ curl http://10.0.0.111:9100/health
ok
$ curl http://10.0.0.111:9100/metrics
# HELP pi_cpu_usage_percent CPU usage percentage
# TYPE pi_cpu_usage_percent gauge
pi_cpu_usage_percent{cpu="total",mode="user"} 2.5
pi_cpu_usage_percent{cpu="total",mode="idle"} 95.3
...
$ _

Content

  1. Project Overview
  2. Architecture & Data Flow
  3. Metrics Collection
  4. Live Web Dashboard
  5. Grafana Integration
  6. Build & Deploy Pipeline
  7. Debugging Adventures
  8. Key Learnings
  9. Full Documentation (Notion)

Project Overview

Pi-Monitor is a lightweight system monitoring agent written in Rust, designed to run on the same Raspberry Pi 3A+ that hosts our custom RustPi Linux distribution. It reads system metrics directly from /proc and /sys, serves them in multiple formats, and includes a built-in web dashboard — all in a single 741KB static binary.

This project is completely separate from RustPi — it has its own repository, its own build pipeline, and its own Vagrant VM. The binary is deployed via SCP and runs as a standalone process. RustPi stays untouched.

📟

Live Dashboard

Built-in web UI with CPU graphs, memory rings, and click-to-expand modals

📊

Prometheus Native

Standard /metrics endpoint compatible with any Prometheus scraper

🔧

Zero Dependencies

Single static musl binary — just SCP it to the Pi and run

📈

Grafana Ready

Pre-built Grafana dashboard with Tron-themed aquamarine visuals

Endpoints

GET /          Live web dashboard with auto-refresh
GET /metrics   Prometheus exposition format
GET /json      All metrics as JSON
GET /health    Liveness check (returns "ok")

Architecture & Data Flow

Data flows one direction: the kernel exposes raw counters in /proc → our Rust code parses those text files into structured data → the HTTP server formats it for whichever endpoint was requested.

HTTP Server: tokio + hyper (port 9100)
Route Handlers: /metrics, /json, /health, /
Prometheus Formatter: exposition format output
Metrics Collectors: 5 modules parsing /proc
Linux Kernel: /proc & /sys virtual filesystems
Hardware: Pi 3A+ (BCM2837)
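The route-handler layer boils down to matching the request path to a formatter. A minimal sketch of that dispatch (illustrative only, not the project's actual hyper handler; the exposition content type `text/plain; version=0.0.4` is the standard Prometheus one):

```rust
// Hypothetical route table: path -> (status, content type).
// The real server wires this into tokio + hyper; only the match is shown.
fn route(path: &str) -> (u16, &'static str) {
    match path {
        "/" => (200, "text/html"),
        "/metrics" => (200, "text/plain; version=0.0.4"),
        "/json" => (200, "application/json"),
        "/health" => (200, "text/plain"),
        _ => (404, "text/plain"),
    }
}

fn main() {
    for path in ["/", "/metrics", "/json", "/health", "/nope"] {
        let (status, content_type) = route(path);
        println!("{path} -> {status} ({content_type})");
    }
}
```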

The CPU Sampling Problem

Every metric except CPU is a point-in-time snapshot — read /proc/meminfo and you get current memory usage. CPU is different: /proc/stat gives cumulative tick counts since boot. To get a percentage, you need two readings separated by a time interval, compute the difference, and calculate what fraction of elapsed ticks were spent in each state.

This is solved with a background tokio task that samples /proc/stat every 2 seconds and stores the computed percentages in an Arc<Mutex<CpuMetrics>>. HTTP handlers read the latest values without blocking.

Metrics Collection

Five collectors, each parsing a different /proc file format:

/proc/stat (cpu.rs): CPU ticks per core → usage percentages via background sampling
/proc/meminfo (memory.rs): key-value pairs → total, used, free, available, buffers, cached
/proc/net/dev (network.rs): whitespace table → rx/tx bytes, packets, errors per interface
/proc/loadavg (system.rs): load averages, process count, uptime
statfs() syscall (disk.rs): disk total/used/free per mount point

CPU Usage Algorithm

rust
/// Compare two /proc/stat snapshots and compute percentages
pub fn calculate_usage(prev: &RawCpuCounters, curr: &RawCpuCounters) -> CpuUsage {
    let user_diff = curr.user.saturating_sub(prev.user);
    let system_diff = curr.system.saturating_sub(prev.system);
    let idle_diff = curr.idle.saturating_sub(prev.idle);
    // The full version also diffs the remaining /proc/stat modes
    // (nice, iowait, irq, ...); they follow the same pattern.
    let total_diff = (user_diff + system_diff + idle_diff).max(1);

    CpuUsage {
        user_percent: (user_diff as f64 / total_diff as f64) * 100.0,
        system_percent: (system_diff as f64 / total_diff as f64) * 100.0,
        idle_percent: (idle_diff as f64 / total_diff as f64) * 100.0,
    }
}

Live Web Dashboard

The dashboard is a single HTML page embedded in the binary at compile time using Rust's include_str!() macro. It polls /json every 2 seconds and renders all metrics with canvas-drawn graphs and interactive modals.
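Serving an embedded page is then a one-liner per request. A compile-on-its-own sketch (a string literal stands in for the real include_str!() call, and the handler shape is illustrative, not the project's actual hyper code):

```rust
// In the real binary: const DASHBOARD: &str = include_str!("dashboard.html");
// A literal stands in here so this sketch is self-contained.
const DASHBOARD: &str = "<html><body>Pi-Monitor</body></html>";

/// Hypothetical `/` handler result: status, content type, embedded page.
fn serve_root() -> (u16, &'static str, &'static str) {
    (200, "text/html", DASHBOARD)
}

fn main() {
    let (status, content_type, body) = serve_root();
    println!("{status} {content_type} ({} bytes)", body.len());
}
```

Since include_str!() resolves at compile time, the HTML ships inside the 741KB binary with no files to deploy alongside it.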

[Dashboard preview, served at http://10.0.0.111:9100: per-core CPU bars (total 3.2%; cpu0 2.1%, cpu1 5.3%, cpu2 1%, cpu3 4.5%), a memory ring at 8.2% (34.0 MB used of 416 MB), uptime 2d 14h 32m, network counters for eth0 (rx 1.2 MB, tx 45.3 KB), and disk usage for / (35.4%), /tmp (0.1%), and /run (0.5%)]

Click any card to expand — modals show detailed breakdowns with larger fonts and more data

Grafana Integration

Since Pi-Monitor speaks native Prometheus format, it integrates directly with Prometheus + Grafana for historical monitoring with time-range queries, alerting, and all the features a production monitoring stack provides.

[Grafana dashboard preview ("Pi-Monitor // RustPi"): CPU 3.2%, memory 8.2%, disk / 35.4%, uptime 2d 14h, load 0.12, 105 processes, plus per-core CPU usage graphs for cpu0 through cpu3]

PromQL Queries

promql
# CPU usage (excluding idle)
100 - pi_cpu_usage_percent{cpu="total", mode="idle"}

# Memory usage percentage
pi_memory_used_bytes / pi_memory_total_bytes * 100

# Network throughput (bytes/sec)
rate(pi_network_receive_bytes_total{interface="eth0"}[1m])

Build & Deploy Pipeline

The entire workflow is scripted — build in a Vagrant VM, deploy via SCP, auto-start on login.

bash
# Build the static binary (inside Vagrant VM)
./scripts/build.sh
# Output: 741KB static aarch64 musl binary

# Deploy to Pi (stops old instance, uploads, starts)
PI_HOST=10.0.0.111 ./scripts/deploy.sh

# Setup Prometheus + Grafana on Mac
./scripts/monitoring-setup.sh 10.0.0.111

Release Optimizations

toml
[profile.release]
opt-level = "s"      # Optimize for binary size
lto = true           # Link-time optimization
codegen-units = 1    # Better optimization, slower build
strip = true         # Remove debug symbols
panic = "abort"      # No unwinding machinery

Debugging Adventures

Lessons learned while building and deploying Pi-Monitor:

Symptom: curl: (7) Failed to connect to localhost port 9100: Connection refused

Cause: The CPU background sampler needs 2 seconds for its first tick before the server is fully ready.

Solution: Wait a moment after starting, or add a readiness delay. It is a harmless race condition.

Key Learnings

Technical Skills Gained

Async Rust (tokio) · HTTP Servers (hyper) · Prometheus Format · /proc Filesystem · Arc<Mutex<T>> · musl Static Linking · Canvas Rendering · Grafana Dashboards · PromQL · Background Tasks

1. CPU Usage Is Not Straightforward

Unlike memory or disk, CPU percentage requires diffing two snapshots of cumulative counters over time. This is why every monitoring tool needs a background sampling loop.

2. Separate Reading from Parsing

Every /proc parser has a read function (touches the filesystem) and a parse function (takes a string). This makes unit testing trivial — paste real Pi data into tests without needing /proc.
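The split looks like this in practice. A self-contained sketch for /proc/loadavg (illustrative function and struct names, not the project's exact system.rs API):

```rust
use std::fs;

#[derive(Debug, PartialEq)]
struct LoadAvg {
    one: f64,
    five: f64,
    fifteen: f64,
}

/// Pure parser: takes the file's text, so tests never need a real /proc.
fn parse_loadavg(text: &str) -> Option<LoadAvg> {
    let mut fields = text.split_whitespace();
    Some(LoadAvg {
        one: fields.next()?.parse().ok()?,
        five: fields.next()?.parse().ok()?,
        fifteen: fields.next()?.parse().ok()?,
    })
}

/// Thin reader: the only function that touches the filesystem.
fn read_loadavg() -> Option<LoadAvg> {
    parse_loadavg(&fs::read_to_string("/proc/loadavg").ok()?)
}

fn main() {
    // Real data pasted from the Pi can be tested anywhere:
    let parsed = parse_loadavg("0.12 0.08 0.05 1/105 1234").unwrap();
    println!("{parsed:?}");

    // Only on a live Linux box does the reader return anything.
    if let Some(live) = read_loadavg() {
        println!("live: {live:?}");
    }
}
```

Unit tests call parse_loadavg with captured strings; read_loadavg stays a three-line wrapper too trivial to break.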

3. Embed Assets at Compile Time

Using include_str!() to bake HTML/CSS/JS into the binary means zero runtime dependencies. Edit the files separately during development, get a single self-contained binary for deployment.

“The best monitoring tool is the one you understand completely — because when it breaks at 3am, you can fix it.”

Full Documentation

Want the complete technical deep-dive? The full Pi-Monitor documentation covers every /proc file format, every Rust module line-by-line, the async architecture, dashboard design decisions, and every issue encountered with its resolution.

Pi-Monitor — Complete Technical Documentation

tungsten-bramble-977.notion.site

/proc Parsing · Async Rust · HTTP Server · Dashboard Design · Prometheus Format · Grafana Setup · Debugging Guide · All Source Code

Covers how Linux exposes metrics through /proc, the CPU tick-diffing algorithm, tokio async runtime internals, hyper HTTP server setup, Prometheus exposition format, compile-time asset embedding with include_str!(), canvas-drawn dashboard visuals, Grafana + Prometheus integration, and every debugging issue encountered.

Open in Notion