Doctor architecture
The technical picture of Doctor for builders and auditors: the supervision model, the repair ladder, health classification, and the trust posture. Grounded in the Doctor technical manual.
#Overview
Doctor is a zero-dependency watchdog. Its package declares no runtime dependencies and uses only the platform's own building blocks: HTTP probing goes through the built-in HTTP client, telemetry reads go through the built-in SQLite opened read-only, and shell-outs pass argument arrays, never a command string handed to a shell. Build tooling is used only during development and never ships.
#Supervision model
The operating system supervises Doctor, and Doctor supervises everything else. The OS service manager restarts Doctor on a crash and starts it on boot. Doctor deliberately has no "restart myself" code path, because a self-restart would put the watchdog inside a failure it is supposed to stand outside of.
#Multi-daemon registry
Doctor reads a static registry of daemons and spawns one fully independent supervisor per entry, each with its own probe, backoff, ladder, state, and incident record. A daemon that is down is still supervised, because "should exist" survives independently of "is running." If the registry is missing, Doctor falls back to the memory daemon as the primary. A malformed registry does not cause a crash loop: Doctor falls back, logs the fallback, and records a needs-attention banner.
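A minimal sketch of the fallback behaviour described above, assuming the registry is a JSON array on disk. The `DaemonEntry` fields, the `/health` path, and the `loadRegistry` name are hypothetical, chosen only for illustration; the caller is assumed to do the file read and the logging.

```typescript
// Hypothetical shape of one registry entry; the field names are assumptions.
interface DaemonEntry {
  name: string;
  healthUrl: string;
}

interface RegistryLoad {
  entries: DaemonEntry[];
  fellBack: boolean;             // true when the fallback entry is in use
  needsAttention: string | null; // banner text, set only for a malformed registry
}

// Assumed fallback: the memory daemon on its documented port 3850.
// The "/health" path is illustrative, not taken from the manual.
const FALLBACK: DaemonEntry = {
  name: "memory",
  healthUrl: "http://127.0.0.1:3850/health",
};

function isEntry(value: unknown): value is DaemonEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.name === "string" && typeof v.healthUrl === "string";
}

// `raw` is the registry file's text, or null when the file is missing.
// This function never throws, so a bad registry cannot crash-loop the caller.
function loadRegistry(raw: string | null): RegistryLoad {
  if (raw === null) {
    return { entries: [FALLBACK], fellBack: true, needsAttention: null };
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    if (Array.isArray(parsed) && parsed.length > 0 && parsed.every(isEntry)) {
      return { entries: parsed, fellBack: false, needsAttention: null };
    }
  } catch {
    // Unparseable text is handled by the malformed path below.
  }
  return {
    entries: [FALLBACK],
    fellBack: true,
    needsAttention: "registry malformed; supervising fallback daemon only",
  };
}
```

The caller would log whenever `fellBack` is true and start one supervisor per element of `entries`.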
#Health classification
Each probe is one HTTP request that resolves to exactly one of four outcomes: healthy; degraded, with per-subsystem reasons; unreachable because the connection was refused; or unreachable because the request timed out. The refused-versus-timeout distinction separates a daemon that is down from one that is wedged, which lets Doctor apply a targeted repair rather than a blind restart. The probe never throws.
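The four outcomes can be modelled as a discriminated union. The sketch below is illustrative only: the `RawProbe` input shape, the `subsystems` field in the health body, and the choice to treat every non-refused transport error as a timeout are assumptions, not details from the manual.

```typescript
// The four outcomes named in the text.
type Health =
  | { kind: "healthy" }
  | { kind: "degraded"; reasons: string[] }
  | { kind: "unreachable-refused" }  // daemon is down
  | { kind: "unreachable-timeout" }; // daemon is wedged

// Hypothetical raw result handed over by the HTTP layer.
type RawProbe =
  | { ok: true; status: number; body: string }
  | { ok: false; errorCode: string };

// Maps a raw result to exactly one outcome. It never throws:
// parse failures are reported as a degraded reason instead.
function classify(raw: RawProbe): Health {
  if (!raw.ok) {
    // Assumption: anything other than a refused connection counts as a timeout.
    return raw.errorCode === "ECONNREFUSED"
      ? { kind: "unreachable-refused" }
      : { kind: "unreachable-timeout" };
  }
  if (raw.status !== 200) {
    return { kind: "degraded", reasons: ["http status " + raw.status] };
  }
  try {
    const parsed: unknown = JSON.parse(raw.body);
    const reasons: string[] = [];
    if (typeof parsed === "object" && parsed !== null) {
      // Assumed body shape: { "subsystems": { "<name>": "ok" | "<reason>" } }
      const subsystems = (parsed as Record<string, unknown>).subsystems;
      if (typeof subsystems === "object" && subsystems !== null) {
        const table = subsystems as Record<string, unknown>;
        for (const name of Object.keys(table)) {
          if (table[name] !== "ok") reasons.push(name + ": " + String(table[name]));
        }
      }
    }
    return reasons.length === 0 ? { kind: "healthy" } : { kind: "degraded", reasons };
  } catch {
    return { kind: "degraded", reasons: ["unparseable health body"] };
  }
}
```

Because the union is closed, a `switch` on `kind` in the repair code is checked exhaustively by the compiler, so no outcome can silently fall through to a generic restart.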
#The repair ladder
When a daemon is sick, Doctor climbs:
- Restart it.
- If restarts keep failing, reinstall it.
- If a conflicting global package is detected, remove it.
- Escalate.
Backoff between rungs is geometric, with a floor and a ceiling, and the ladder stops the instant health returns. Escalation is the terminal hand-off rather than a rung in its own right: it builds a record containing the diagnosis, the steps tried, and the recommended action, and, for any deferred action, a note of what Doctor would have done.
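The ladder and its backoff can be sketched as follows. This is an illustration under stated assumptions, not Doctor's actual code: the growth factor of 2, the `LadderDeps` callbacks, the synchronous `wait`, and the placeholder recommendation text are all hypothetical. The floor, ceiling, and restart limit use the documented defaults.

```typescript
const BACKOFF_FLOOR_MS = 1000;    // documented default floor
const BACKOFF_CEILING_MS = 30000; // documented default ceiling

// Geometric backoff: floor * factor^attempt, clamped to [floor, ceiling].
// The factor of 2 is an assumption; the text only says "geometric".
function backoffMs(attempt: number, factor = 2): number {
  const raw = BACKOFF_FLOOR_MS * Math.pow(factor, attempt);
  return Math.min(BACKOFF_CEILING_MS, Math.max(BACKOFF_FLOOR_MS, raw));
}

type Rung = "restart" | "reinstall" | "remove-conflicting-global";

interface Escalation {
  diagnosis: string;
  stepsTried: Rung[];
  recommendedAction: string;
  deferredNotes: string[]; // what would have been done for deferred actions
}

// Hypothetical injection points so the ladder itself stays pure.
interface LadderDeps {
  attempt: (rung: Rung) => boolean;        // true when health returned
  conflictingGlobalDetected: () => boolean;
  wait: (ms: number) => void;
}

type LadderResult =
  | { recovered: true; stepsTried: Rung[] }
  | { recovered: false; escalation: Escalation };

function climb(diagnosis: string, deps: LadderDeps, maxRestarts = 3): LadderResult {
  const stepsTried: Rung[] = [];
  let attemptNo = 0;
  const run = (rung: Rung): boolean => {
    // Back off before every step except the first.
    if (attemptNo > 0) deps.wait(backoffMs(attemptNo - 1));
    attemptNo += 1;
    stepsTried.push(rung);
    return deps.attempt(rung);
  };
  // The ladder stops the instant any step restores health.
  for (let i = 0; i < maxRestarts; i++) {
    if (run("restart")) return { recovered: true, stepsTried };
  }
  if (run("reinstall")) return { recovered: true, stepsTried };
  if (deps.conflictingGlobalDetected() && run("remove-conflicting-global")) {
    return { recovered: true, stepsTried };
  }
  // Terminal hand-off: build the record instead of taking another action.
  return {
    recovered: false,
    escalation: {
      diagnosis,
      stepsTried,
      recommendedAction: "manual review required", // placeholder text
      deferredNotes: [],
    },
  };
}
```

With these values the delays run 1 s, 2 s, 4 s, and so on, and never exceed 30 s however long the ladder runs.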
#The blessed-update gate
Doctor auto-updates the memory daemon only behind a gate: a version must be explicitly approved for rollout, the update is verified healthy afterward, and a failed verify rolls back to the last working version. A bad release cannot spread itself. Doctor never auto-updates its own package; a single explicit command is the only way that happens.
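The gate reduces to three checks in sequence: approved, verified, and otherwise rolled back. The sketch below illustrates that flow; `UpdateDeps`, `gatedUpdate`, and the outcome names are hypothetical, and the currently running version is assumed to be the last working one.

```typescript
// Hypothetical injection points; none of these names come from the manual.
interface UpdateDeps {
  isBlessed: (version: string) => boolean; // explicitly approved for rollout?
  install: (version: string) => void;
  verifyHealthy: () => boolean;            // post-update health verification
}

type UpdateOutcome = "skipped-not-blessed" | "updated" | "rolled-back";

interface UpdateResult {
  outcome: UpdateOutcome;
  running: string; // version running once the function returns
}

// `current` is assumed to be the last working version.
function gatedUpdate(current: string, candidate: string, deps: UpdateDeps): UpdateResult {
  // Gate 1: an unapproved version is never installed.
  if (!deps.isBlessed(candidate)) {
    return { outcome: "skipped-not-blessed", running: current };
  }
  deps.install(candidate);
  // Gate 2: the update only stands if the daemon verifies healthy afterward.
  if (deps.verifyHealthy()) {
    return { outcome: "updated", running: candidate };
  }
  // Failed verification: reinstall the last working version.
  deps.install(current);
  return { outcome: "rolled-back", running: current };
}
```

Whatever the outcome, the function returns with a version that was either verified just now or already known to work.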
#Ports and the status page
Doctor serves one HTTP listener on 127.0.0.1:3852, which carries the human-readable status page, the machine-readable status feed, and the live health stream. It is read-only by construction: no route mutates, proxies, or triggers an action, and nothing binds to a public address. The daemons it watches sit on their own ports: the memory daemon on 3850, its embeddings child on 3851, the portal on 3853, and the codebase daemon on 3854.
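One way to make a listener read-only by construction is to route through a table that contains only read routes and rejects every non-read method. The port numbers below come from the text; the URL paths, the `route` function, and the status-code choices are assumptions made for illustration.

```typescript
// Port assignments as documented.
const PORTS = {
  memory: 3850,
  embeddings: 3851,
  doctor: 3852,
  portal: 3853,
  codebase: 3854,
} as const;

// Loopback only: nothing binds to a public address.
const BIND_HOST = "127.0.0.1";

type ReadRoute = "status-page" | "status-feed" | "health-stream";

interface RouteResult {
  status: number;
  route: ReadRoute | null;
}

// Hypothetical paths. The point is that the table holds read routes only,
// so there is no handler that could mutate state or trigger an action.
function route(method: string, path: string): RouteResult {
  if (method !== "GET" && method !== "HEAD") {
    return { status: 405, route: null }; // writes are rejected outright
  }
  switch (path) {
    case "/":
      return { status: 200, route: "status-page" };
    case "/status.json":
      return { status: 200, route: "status-feed" };
    case "/health/stream":
      return { status: 200, route: "health-stream" };
    default:
      return { status: 404, route: null };
  }
}
```

A listener built on this would bind with `BIND_HOST` and `PORTS.doctor` and hand each request's method and path to `route`.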
#Telemetry, single source of truth
Each service writes non-sensitive telemetry to its own local database. Doctor polls those read-only, merges them with health into one authoritative in-memory picture of the fleet, and feeds exactly one stream to the portal. Anything that leaves the machine passes a single chokepoint with allow-list scrubbing and layered opt-out gates.
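Allow-list scrubbing means naming the fields that may leave and dropping everything else, rather than trying to enumerate what is sensitive. A minimal sketch follows, assuming two opt-out gates and a hypothetical field list; the field names, gate names, and `scrubForExport` are illustrative, not taken from the manual.

```typescript
// Hypothetical allow-list: only these keys may ever leave the machine.
const ALLOWED_FIELDS: readonly string[] = ["service", "version", "health", "uptimeSeconds"];

// Hypothetical layered gates; any single one is enough to block export.
interface OptOutGates {
  envOptOut: boolean;
  configOptOut: boolean;
}

// The single chokepoint: returns null when export is blocked,
// otherwise a copy holding only allow-listed keys.
function scrubForExport(
  record: Record<string, unknown>,
  gates: OptOutGates,
): Record<string, unknown> | null {
  if (gates.envOptOut || gates.configOptOut) return null;
  const out: Record<string, unknown> = {};
  for (const key of ALLOWED_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(record, key)) {
      out[key] = record[key];
    }
  }
  return out;
}
```

A field added to a telemetry record later is dropped by default until someone deliberately adds it to the list, which is the main reason to prefer an allow-list over a deny-list here.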
#Defaults
- Probe interval: every 30 seconds.
- Per-probe timeout: 2 seconds.
- Startup grace: 60 seconds, so a booting daemon is not judged dead.
- Restart limit: give up on restarts after three consecutive failures.
- Post-restart cooldown: 5 seconds.
- Backoff: a floor of 1 second and a ceiling of 30 seconds.
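The defaults above can be gathered into one typed constant. The values are the documented ones; the property names and the two helper functions are hypothetical, shown only to illustrate how the grace period and the restart limit might be applied.

```typescript
// Documented defaults, expressed in milliseconds where applicable.
const DEFAULTS = {
  probeIntervalMs: 30000,
  probeTimeoutMs: 2000,
  startupGraceMs: 60000,
  maxConsecutiveRestartFailures: 3,
  postRestartCooldownMs: 5000,
  backoffFloorMs: 1000,
  backoffCeilingMs: 30000,
} as const;

// A daemon still inside its startup grace is not judged dead.
function withinStartupGrace(startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs < DEFAULTS.startupGraceMs;
}

// Restarts are abandoned once the consecutive-failure limit is reached.
function shouldGiveUpOnRestarts(consecutiveFailures: number): boolean {
  return consecutiveFailures >= DEFAULTS.maxConsecutiveRestartFailures;
}
```

Note that the probe timeout is much shorter than the probe interval, so a timed-out probe has always finished before the next one is due.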
#Credential safety
There is no code path in Doctor that reads, writes, or deletes the credentials file. A suspected credential fault is escalated with a recommendation, never automated, and there is deliberately no command to clear credentials.
#License
Released under the GNU Affero General Public License, version 3.0 or later.