Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
🎯Key functional considerations
This is the front door for all engineering work in Carolverse, so its architecture is shaped by what it must guarantee end-to-end:
- Two intake modes that share one path. Autonomous (planner) and operator-driven (bypass) work flow through the same file → plan → execute → review → sign-off lifecycle, so behaviour and guarantees do not fork.
- Budget-aware sequencing. A sprint-based backlog packs work under a fixed monthly budget; a tunable priority score (incident severity, urgency, age, dependents, roadmap value) decides order, with critical incidents in a hard top tier.
- Sub-second incident filing. A breakage must be filed and surfaced fast, so the incident path skips the expensive AI classification steps.
- Self-healing. A missing capability is built on demand rather than blocking; failures are detected and re-planned.
- Accountable and observable by construction. Every action is tagged to an agent and a doing droid, a security gate authorises each filing, and every initiative shows on a monitor with a deterministic state.
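The budget-aware sequencing described above can be illustrated with a minimal sketch. It assumes a linear weighted score and a greedy packer; the weights, field names and the `pack_sprint` helper are hypothetical and not taken from the pipeline itself. The only behaviour drawn from the text is that critical incidents sit in a hard top tier and that work is packed under a fixed budget.

```python
from dataclasses import dataclass

# Hypothetical weights: the source says only that the score is tunable.
WEIGHTS = {"severity": 5.0, "urgency": 3.0, "age_days": 0.1,
           "dependents": 1.0, "roadmap_value": 2.0}

@dataclass
class Initiative:
    name: str
    cost: float
    severity: int = 0
    urgency: int = 0
    age_days: int = 0
    dependents: int = 0
    roadmap_value: int = 0
    critical_incident: bool = False

def priority_key(item):
    """Sort key: tier first, then descending weighted score."""
    score = sum(weight * getattr(item, name) for name, weight in WEIGHTS.items())
    # Critical incidents form a hard top tier, whatever their score.
    return (0 if item.critical_incident else 1, -score)

def pack_sprint(backlog, budget):
    """Greedy packing (an assumption): take items in priority order while they fit."""
    chosen, spent = [], 0.0
    for item in sorted(backlog, key=priority_key):
        if spent + item.cost <= budget:
            chosen.append(item)
            spent += item.cost
    return chosen
```

Sorting on a `(tier, -score)` tuple keeps the hard tier separate from the tunable score, so retuning the weights can never push ordinary work ahead of a critical incident.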
🧰Technologies used
- Python 3 services on FastAPI / Flask, served by uvicorn/gunicorn behind nginx.
- SQLite (WAL) datastores. The initiatives database sits behind a single-writer relay so concurrent droids cannot cause a write-lock storm.
- systemd and cron schedule the recurring droids; a flock serialises ticks and a separate heartbeat database absorbs run-audit writes so they never contend with real work.
- Claude drives the AI build steps: Sonnet for most steps, Opus for the hardest reasoning.
- Git holds the code under change; the close gates diff against the start commit.
- The registry and the design store are the binding sources of truth the pipeline reads from.
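The single-writer relay idea can be sketched as follows. This is an illustration under assumptions, not the pipeline's relay: the `WriteRelay` class and its methods are hypothetical, and the real relay may well be a separate service rather than a thread. It shows only the general pattern: many callers enqueue writes, and one connection in WAL mode applies them in order.

```python
import queue
import sqlite3
import threading
from concurrent.futures import Future

class WriteRelay:
    """Hypothetical relay: callers enqueue writes, one thread applies them."""

    def __init__(self, path):
        self._path = path
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The only connection that ever writes to the database.
        conn = sqlite3.connect(self._path)
        conn.execute("PRAGMA journal_mode=WAL")
        while True:
            job = self._jobs.get()
            if job is None:  # shutdown signal
                break
            sql, params, result = job
            try:
                with conn:  # commit on success, roll back on error
                    conn.execute(sql, params)
                result.set_result(None)
            except sqlite3.Error as exc:
                result.set_exception(exc)
        conn.close()

    def write(self, sql, params=()):
        """Block until the relay has applied the write; re-raise its error if any."""
        result = Future()
        self._jobs.put((sql, params, result))
        return result.result()

    def close(self):
        self._jobs.put(None)
        self._thread.join()
```

Because writes are serialised through one queue, concurrent callers wait on their own result rather than competing for the database write lock.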
🏗Solution architecture
The service is a pipeline of blocks — the distinct steps shown in the Blocks section of the service page — wired together by a small control plane. It is a direct instance of Carolverse's agent-centric modular architecture: each block is owned by an agent and carried out by that agent's droids (see the service's team and blocks above).
- Two planner layers. Phase-level planning sequences the work; task-level planning breaks each phase into steps.
- A single mutator. One status-router is the only thing that changes an initiative's status and state together, so the monitor cards are a pure function of state (executing → Current, reviewing/closed → Recent, blocked/redirected → Escalation, dispatched → the queue).
- A rolling dispatch queue. The top-priority initiatives sit in a short dispatch window; as one enters execution the next backfills.
- Skills as the unit of capability. A meta-skill lets the pipeline author and wire a brand-new skill, and a skill-gap healer inserts a build-the-skill step ahead of any step whose skill is missing.
- A filing security gate runs at creation: authorisation, role-vs-content alignment, and policy compliance.
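The single-mutator and rolling-queue ideas above can be sketched in a few lines. The state-to-card mapping follows the text; the function names, the dictionary shape of an initiative and the window size of three are assumptions made for illustration.

```python
# State-to-card mapping as described in the text.
CARD_FOR_STATE = {
    "executing": "Current",
    "reviewing": "Recent",
    "closed": "Recent",
    "blocked": "Escalation",
    "redirected": "Escalation",
    "dispatched": "Dispatch",
}

def route(initiative, status, state):
    """Hypothetical status-router: the one place status and state change, together."""
    if state not in CARD_FOR_STATE:
        raise ValueError(f"unknown state: {state}")
    # Return a new record so both fields always move in one step.
    return {**initiative, "status": status, "state": state}

def card_for(initiative):
    """The monitor card is a pure function of state."""
    return CARD_FOR_STATE[initiative["state"]]

def dispatch_window(ranked_ids, executing, size=3):
    """Top-priority initiatives not yet executing; the window size is an assumption."""
    return [i for i in ranked_ids if i not in executing][:size]
```

With a ranked list `a, b, c, d, e`, the window starts as `a, b, c`; once `a` enters execution it becomes `b, c, d`, which is the backfill behaviour described above.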
📐Design principles followed
- Single source of truth. Counts and narratives come from the live registry and design store, never hand-copied — the shared principle described on the Carolverse Architecture page.
- Single-writer mutation. Status changes go only through the status-router; the database is written through one relay.
- Self-heal over block-and-wait. Build the missing capability and continue rather than escalate and stall.
- Agent-centric modular architecture. Every block has an accountable agent and a doing droid (Design #146).
- Bypass skips the planner, not the standards — same template checklist, review and observability as an autonomous run.
- Observability first. If it is not on a monitor with a deterministic state, it is not done.
✅Success criteria
- An initiative flows from file to close without manual nudging in the common case.
- No false blocks — reviewing / awaiting-UAT is never escalated by the stuck-watchdog.
- Incidents are fast — critical breakages file in well under a second and surface at top priority.
- Every closed change carries a twin-review verdict and passes the design/architecture and caller-audit gates.
- Monitor cards always match reality — status and state move together, atomically.
- No lock storms and no silent loss of run-audit history.
- New skills are created on the go — when a step needs a capability that does not exist yet, the pipeline builds and wires the skill rather than blocking.
- Albus wakes up when things fail — a failed step trips Albus's troubleshooter, which diagnoses the root cause and drives the fix.
- Auto-healing is enabled — failures are detected, re-planned and remediated automatically, without a human in the loop.
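The "new skills are created on the go" criterion can be pictured as a plan rewrite. The sketch below assumes a plan is a list of steps that each name one skill; the step shape, the `heal_skill_gaps` name and the `meta-skill` label are hypothetical.

```python
def heal_skill_gaps(steps, known_skills):
    """Insert a build-the-skill step ahead of the first step that needs a missing skill."""
    healed = []
    available = set(known_skills)
    for step in steps:
        skill = step["skill"]
        if skill not in available:
            # Assumption: the build step itself is carried out by the meta-skill.
            healed.append({"name": f"build skill: {skill}", "skill": "meta-skill"})
            available.add(skill)  # later steps can reuse the newly built skill
        healed.append(step)
    return healed
```

For a plan whose second and third steps both need an unknown `plot` skill, a single build step is inserted before the second step and the third proceeds unchanged.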
🛡Service-specific policies
These are the rules the build pipeline is bound by, drawn from the Carol policies and the build cookbook:
- Agents own, droids execute. Every step is run by a named droid, every process has a droid behind it, and an agent never self-executes. (policies P.01.03.02.01 / P.01.03.01.01 / P.01.03.01.04)
- Every CLI action is tagged to a droid. A bypass refuses to open without a droid tag and must pass its hygiene gates. (cookbook: bypass v3 droid-tag mandate)
- Only Elrond files; everyone else requests. Any agent can raise a request, but the act of filing — and every lifecycle operation — routes through Elrond, never a raw write. (cookbook: only Elrond files / lifecycle ops route through Elrond)
- A bypass must declare what it remediates. It either remediates a named initiative or explicitly remediates nothing, and a remediating bypass is a separate initiative from the original. (cookbook: remediation-linkage mandatory contract)
- Bypass keeps planner parity. Bypass skips the planner's orchestration, not its standards — same data shape, gates and review as a planner run. (cookbook: bypass structural parity)
- No doer-to-doer handshakes. Work flows through the orchestrator, Merlin, not directly between doers. (cookbook: orchestrator-pattern guardrail)
- Gates at filing and dispatch. Compliance is enforced at the filing stage first, with dispatch-stage defence-in-depth. (cookbook: filing-stage primary, dispatch-stage defense-in-depth)
- Cross-agent, multi-check review. Every change is reviewed by a different agent from the one that built it, with a three-check review and a verifying twin droid. (policies P.01.02.05.05 / P.01.02.05.02 / P.01.03.02.03)
- Dev and prod stay separate; apps stay thin. Work is built in dev and shipped to prod, and apps carry no business logic — they call the agent layer. (policies P.03.02.02.01 / P.01.03.01.05)
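Two of the bypass rules above, the droid tag and the remediation declaration, lend themselves to a simple pre-flight check. The sketch below is illustrative only: the `BypassRequest` shape, the `NOTHING` sentinel and the message wording are assumptions, not the cookbook's actual contract.

```python
from dataclasses import dataclass
from typing import Optional

NOTHING = "none"  # hypothetical sentinel for "explicitly remediates nothing"

@dataclass
class BypassRequest:
    initiative_id: str
    droid_tag: Optional[str]
    remediates: Optional[str]  # an initiative id, NOTHING, or None if undeclared

def bypass_violations(req):
    """Return the reasons a bypass should refuse to open (empty list means OK)."""
    problems = []
    if not req.droid_tag:
        problems.append("missing droid tag")
    if req.remediates is None:
        problems.append("remediation not declared")
    elif req.remediates == req.initiative_id:
        # A remediating bypass is a separate initiative from the original.
        problems.append("remediating bypass must be a separate initiative")
    return problems
```

Returning every violation at once, rather than stopping at the first, lets the operator fix a refused bypass in one pass.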
📦End-user deliverables
Current
- File a build initiative with decisions, requirements, success criteria, strategy, plan steps and budget — enabled by Elrond, via his Initiative Creator, Requirements Author and Filing Gate.
- Autonomous (planner) or operator-driven (bypass) builds — Merlin orchestrates (his Sequencer), Forge writes the code (his Builder), Argus tests (his Tester), and Elrond's Initiative Planner sequences the work; in bypass, Orion drives it.
- Sprint-based backlog, daily sprint plan, rolling dispatch queue and defer — Elrond, via his Sprint Builder, Daily Planner and Dispatcher.
- A monitor dashboard with deterministic Current / Recent / Escalation / Dispatch cards — Elrond, via his Foreman and the status-router, with Hermione's monitors feeding it.
- A per-initiative Palantir story wall — Elrond, via his Reporter, which auto-composes the story at close.
- Twin review, design/architecture and caller-audit gates, and UAT — Themis checks compliance (her Design and Architecture Compliance droids), Argus verifies (his Verifier), Elrond's Step Reviewer and Initiative Review grade the work, and Orion's Twin Reviewer cross-checks.
- Self-healing skills and a sub-second incident fast-path — Sage builds and heals the skills, and Elrond prioritises incidents.
- A daily process sweep that files a fix when a scheduled job fails — Hermione's Process Monitoring detects and files it, and Elrond's pipeline fixes it.
Future (on demand)
- Auto-dispatch of dispatched sprint items (today the operator triggers the push).
- A weekly sprint review that files self-improvement initiatives for the planner.
- Editor surfaces for defer and mode-change (today these are endpoints only).
- Relay hardening for blob columns and a full single-writer relay for the planner database.
- Cached policy briefs to cut filing-security-gate latency.
📘End-user run book
Tools an agent uses to interact with this service
This is how another agent — for example Carol — talks to the build pipeline. The service exposes these function-call tools, so an agent can interact with Elrond without any human in the loop:
- request_initiative_creation — ask Elrond to file a new build initiative (raise a request to build or change software).
- query_initiative_status — check where an initiative stands (planned, executing, reviewing or closed), in plain language.
- request_initiative_dispatch — ask Elrond to start a planned initiative (push it into execution).
- diagnose_initiative_failure — ask why an initiative failed or is blocked and get a plain-language diagnosis.
- request_initiative_closure — ask to close or sign off a finished initiative.
- lookup_cookbook — search the build cookbook for the recipe relevant to a task.
- list_cookbook_sections — list every cookbook entry with a one-line overview.
- lookup_induction — search the build-pipeline induction material.
- list_induction_sections — list every induction section.
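As a rough picture of how a calling agent might build one of these tool calls, the sketch below assembles a JSON payload and rejects unknown tool names. The tool names come from the list above; the payload shape and the argument names (such as `initiative_id`) are assumptions, since the page does not document the tools' parameters.

```python
import json

# Tool names as listed on this page.
TOOLS = {
    "request_initiative_creation",
    "query_initiative_status",
    "request_initiative_dispatch",
    "diagnose_initiative_failure",
    "request_initiative_closure",
    "lookup_cookbook",
    "list_cookbook_sections",
    "lookup_induction",
    "list_induction_sections",
}

def tool_call(name, **arguments):
    """Build a function-call payload; the envelope format is hypothetical."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return json.dumps({"tool": name, "arguments": arguments}, sort_keys=True)
```

A typical sequence would be `request_initiative_creation`, then `query_initiative_status` until the initiative is planned, then `request_initiative_dispatch`.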
Operator path (HTTP)
The operator can reach the same lifecycle over the initiatives app directly:
- File — POST /api/initiatives with title, status=planned, requester (agent id), owner (same id) and requested_mode (bypass or planner).
- Add success criteria at /api/initiatives/{id}/success-criteria and a topic tag at /api/initiatives/{id}/tags.
- Dispatch — planner mode dispatches from the queue; a bypass claims and activates the initiative when it opens.
- Track — watch the Monitor tab; a bypass parks in reviewing tagged uat-pending, then UAT closes it.
- What's stuck — GET /api/initiatives?status=blocked (read through the API, not the database file).
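The file and what's-stuck calls above could be prepared with the Python standard library as sketched here. The paths and field names are taken from this page; the base URL, the JSON request body and the helper names are assumptions, and the sketch only builds the requests without sending them.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical; use the initiatives app's real address

def file_initiative_request(title, agent_id, mode):
    """Build the POST that files a planned initiative (JSON body is an assumption)."""
    if mode not in ("bypass", "planner"):
        raise ValueError("requested_mode must be 'bypass' or 'planner'")
    body = {
        "title": title,
        "status": "planned",
        "requester": agent_id,
        "owner": agent_id,  # owner is the same id as the requester
        "requested_mode": mode,
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/initiatives",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def blocked_initiatives_request():
    """Build the GET that lists blocked initiatives through the API."""
    query = urllib.parse.urlencode({"status": "blocked"})
    return urllib.request.Request(f"{BASE_URL}/api/initiatives?{query}", method="GET")
```

Passing either request to `urllib.request.urlopen` would send it; reading blocked work this way keeps the operator on the API rather than the database file.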
Where the rules live
The cookbook is operational law; the RUNBOOK card on the Monitor tab is the canonical dispatch manual.