Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖About & Usage
About
This block covers executing the step, which is the build itself: analysis and spec-writing, design and dev work, and the executor/runner orchestration that turns plans into delivered code, with each role submitting its deliverable.
Where it fits
This is one stage of the Build Initiatives service. The owner and the agents who run it are listed under the team below, and the other blocks of the service are linked at the bottom of this page.
🛠️Team & droids
Merlin (Block owner)
Merlin is the execution conductor: once planning has handed off a phase of work, Merlin decides the order, dispatches it, watches it, and self-heals it, which is why the densest orchestration droids live with him in this block. The Merlin Sequencer sets the order in which each team member is involved. It defaults to the standard role rank (analyst, designer, builder, tester, reviewer) but honors a sequence_hint override, validates the input, rejects duplicates or malformed data, and emits an ordered chain in which each member's work depends on the previous one. The Step Execution Sequencer is the engine for a single phase: it deduplicates pending steps, builds a dependency tree, runs steps sequentially while keeping RAM usage under control, auto-diagnoses and fixes failures before retrying, detects stuck execution (no progress for 15+ minutes) and raises an alert, reviews task feedback for material issues that warrant a re-run, and creates prerequisite steps on the fly when the plan needs them. The Builder polls every 60 seconds for tool-build requests and scaffolds new tools immediately, so a step that needs a not-yet-existing tool doesn't stall waiting on manual coordination. All of this matters because without Merlin's sequencing-and-dispatch layer, no one would run the per-agent deliverables in dependency order or recover them when they break. The layer fires across the whole execute phase: sequencing first, then dispatch, with the stuck-detection and self-heal paths covering the failure and hang scenarios.
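The ordering rules described for the Merlin Sequencer can be sketched roughly as follows. This is only an illustration, not Carol's actual code: the role rank and the sequence_hint name come from the text above, while `sequence_team`, `ChainLink`, `_validate`, and the exact validation rules are hypothetical names and assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Standard role rank named in the text; everything else here is hypothetical.
ROLE_RANK = ["analyst", "designer", "builder", "tester", "reviewer"]


@dataclass(frozen=True)
class ChainLink:
    role: str
    depends_on: Optional[str]  # role whose work must finish first, if any


def _validate(items, label):
    """Reject malformed input and duplicates (assumed validation rules)."""
    if not isinstance(items, (list, tuple)) or not items:
        raise ValueError(f"{label} must be a non-empty list")
    if not all(isinstance(item, str) and item for item in items):
        raise ValueError(f"{label} must contain non-empty strings")
    if len(set(items)) != len(items):
        raise ValueError(f"{label} contains duplicates")


def sequence_team(roles, sequence_hint=None):
    """Return an ordered chain where each role depends on the previous one."""
    _validate(roles, "roles")
    unknown = [role for role in roles if role not in ROLE_RANK]
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")

    if sequence_hint is None:
        # Default: sort by the standard role rank.
        ordered = sorted(roles, key=ROLE_RANK.index)
    else:
        # Override: the hint must name exactly the same roles.
        _validate(sequence_hint, "sequence_hint")
        if set(sequence_hint) != set(roles):
            raise ValueError("sequence_hint must name exactly the team's roles")
        ordered = list(sequence_hint)

    chain, previous = [], None
    for role in ordered:
        chain.append(ChainLink(role, previous))
        previous = role
    return chain
```

Calling `sequence_team(["tester", "analyst", "builder"])` would order the roles as analyst, builder, tester, with each link pointing at the role before it; passing a sequence_hint replaces the default rank with the caller's order.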
Once a step's plan exists, the build itself has to run, and Albus is the architect voice inside that execution phase — he makes sure each step is shaped by sound high-level design before any other role touches it. The Architect Runner Glue droid is what fires here: it builds the execution context and invokes Albus's architect preflight and enabler droids during step execution (work that was relocated out of the pipeline module under CAROL-INI-0928-03/R8), so that the architectural framing runs inside the build pipeline rather than as a separate manual pass. Without this, steps would execute with no architect preflight and no enabler gating, letting design problems leak straight into Forge's code. Whatever high-level design Albus produces for the step is then persisted by the High-Level Design Submission droid, which calls submit(), inputs() and history() against the step-deliverables store so the hld becomes a tracked, versioned deliverable other roles can read as chain input. It fires early in execute_step, ahead of the design/dev/test runners, and feeds the deliverable chain that Archon, Forge and Argus draw from. In normal flow it runs once per step; the deliverable store keeps prior versions so reruns and reviews can compare against history.
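To make the submit(), inputs() and history() trio more concrete, here is a minimal in-memory sketch of what a versioned step-deliverables store might look like. The three method names come from the text; the class name, the argument lists, the deliverable kind labels, and the chain order are all assumptions inferred from the descriptions on this page, not the real store's API.

```python
from collections import defaultdict

# Hypothetical chain order inferred from the text: each deliverable can read
# the deliverables that come before it as chain input.
CHAIN_ORDER = ["analysis", "hld", "design", "dev_report", "test_result"]


class StepDeliverableStore:
    """In-memory stand-in for the step-deliverables store (illustrative only)."""

    def __init__(self):
        # (step_id, kind) -> list of payloads, oldest first
        self._versions = defaultdict(list)

    def submit(self, step_id, kind, payload):
        """Append a new version and return its 1-based version number."""
        if kind not in CHAIN_ORDER:
            raise ValueError(f"unknown deliverable kind: {kind}")
        versions = self._versions[(step_id, kind)]
        versions.append(payload)
        return len(versions)

    def history(self, step_id, kind):
        """Return every submitted version, oldest first."""
        return list(self._versions.get((step_id, kind), []))

    def inputs(self, step_id, kind):
        """Return the latest version of each upstream deliverable."""
        upstream = CHAIN_ORDER[: CHAIN_ORDER.index(kind)]
        latest = {}
        for upstream_kind in upstream:
            versions = self._versions.get((step_id, upstream_kind))
            if versions:
                latest[upstream_kind] = versions[-1]
        return latest
```

Because submit() appends rather than overwrites, a rerun produces version 2 next to version 1, which is the property the text relies on when it says reruns and reviews can compare against history.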
Archon is the designer in the execution phase, and his job here is narrow but load-bearing: capture the concrete design for the step so it becomes a first-class, reviewable artifact rather than living only in a prompt. The Step Design Submission droid handles this by calling submit(), inputs() and history() through the step-deliverables store, giving Archon's design a tracked, versioned home and letting it read its own chain inputs (such as the upstream analysis and hld). This matters because Forge's development work downstream consumes that design as its chain input — if the design were never submitted, the dev runner would have nothing authoritative to build against and the deliverable chain would break. It fires during execute_step after analysis and architectural framing are available and before the developer runner does its work. In the normal one-pass flow it submits a single design version per step, but the history path means a re-run or a corrected design is versioned rather than overwritten, preserving the audit trail.
Argus is the tester, and within execution his presence is about making the test result a durable, chained deliverable rather than a transient pass/fail that disappears after the run. The Test Result Submission droid does exactly this: it calls submit(), inputs() and history() via the step-deliverables store to give Argus's test result a tracked, versioned home and to let it read the chain inputs it was produced against. This matters because the review block and the judge downstream rely on a recorded, attributable test outcome tied to the right step inputs — without it, there would be no canonical artifact proving the step was tested. It fires late in execute_step, after the design and dev deliverables exist and the actual testing has happened, closing out the per-step deliverable chain. Normally one result is submitted per step, but the versioned history path means a re-test after rework is captured as a new version instead of clobbering the original, keeping the evidence trail intact.
Elrond's role in execution is the spend-and-cycle guardrail that decides whether the next task is even allowed to start, so the pipeline doesn't blow its budget or loop endlessly on reviews. The Budget Gate droid watches two meters for him — the initiative's budget and its review-cycle limits — by reading those limits, summing up all spending so far and counting completed review phases, then deciding whether the gate is open (proceed) or closed (limits hit). This matters because without an automatic gate Elrond would have to manually verify every task start, and a runaway initiative could keep spawning work past its caps. It fires before the next task spawns, sitting at the threshold of each new unit of execution work. When there is still budget room and an available review slot it opens the gate and lets execution continue; when a limit is hit it closes the gate, escalates to Orion, and logs the gate status to the activity log so the stop is visible and attributable.
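The gate decision described above reduces to two comparisons. The sketch below is a hedged illustration: `check_budget_gate`, `GateDecision`, and the parameter names are hypothetical, and treating "limit reached" as closed (rather than only "limit exceeded") is an assumption based on the phrase "still budget room and an available review slot". Escalation to Orion and activity logging are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    is_open: bool
    reason: str


def check_budget_gate(budget_limit, spend_entries, review_cycle_limit, completed_reviews):
    """Decide whether the next task may start (assumed semantics, see lead-in)."""
    total_spent = sum(spend_entries)

    # Meter 1: the initiative's budget.
    if total_spent >= budget_limit:
        return GateDecision(False, f"budget limit hit: spent {total_spent} of {budget_limit}")

    # Meter 2: the review-cycle limit.
    if completed_reviews >= review_cycle_limit:
        return GateDecision(
            False,
            f"review-cycle limit hit: {completed_reviews} of {review_cycle_limit}",
        )

    return GateDecision(True, "within budget and review-cycle limits")
```

A caller would check `is_open` before spawning the next task and, on a closed gate, pass `reason` along to whatever escalation and logging path the pipeline uses.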
Forge is the developer in the execution phase, the role that actually turns a plan and a design into running code, and he owns the heaviest machinery in this block. The Developer Execution Runner builds the context and invokes Forge's Developer Executor, then runs the execute-with-retries loop (relocated from the pipeline module under CAROL-INI-0928-07/R9), so a single failed attempt doesn't kill the step. The Developer Executor itself takes the task request with its success criteria, gathers context about Carol's rules, relevant past work and design constraints, checks the task is even possible (files exist, no conflicts with other apps), uses Claude to generate Python code, runs that code in the Carol environment, and reports success-with-output, failure-with-error, or a correction if the task needs reworking. This is where the build genuinely happens, so without it there is no delivered code at all. The Dev Report Submission droid then persists Forge's dev report as a tracked, versioned deliverable so review can read it as chain input, and Forge's Reporter drafts the in-character Palantir wall post that attributes the work to Forge in his own voice. It fires after design is available and before review, with the retry loop covering the failure and rework scenarios.
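The execute-with-retries idea can be illustrated with a small loop. This is a sketch under stated assumptions, not the relocated runner code: the function name, the result dictionary shape, and the default of three attempts are all hypothetical, and the "correction" outcome mentioned above is not modeled.

```python
def execute_with_retries(run_attempt, max_attempts=3):
    """Call run_attempt(attempt_number) until it succeeds or attempts run out.

    run_attempt is assumed to return a dict with "status" set to "success"
    (plus "output") or "failure" (plus "error").
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    errors = []
    for attempt in range(1, max_attempts + 1):
        result = run_attempt(attempt)
        if result["status"] == "success":
            return {"status": "success", "output": result["output"], "attempts": attempt}
        # Remember why this attempt failed, then try again.
        errors.append(result.get("error", "unknown error"))

    return {"status": "failure", "errors": errors, "attempts": max_attempts}
```

Collecting the error from every failed attempt, rather than only the last one, keeps enough detail for a report or a later diagnosis step.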
Sage is the analyst, and in execution his job is to make sure every executor team starts from a precise spec rather than an unstructured task blurb. The Analyst Spec-Writer automates this: as soon as Merlin lays out a per-agent work plan, it reads the initiative and task data from the database and builds three structured Layer-B specs — one for design (what Archon needs to know), one for development (what Forge must do), and one for testing (what Argus must verify) — each carrying the task title, description, success criteria and execution focus, and writes all three before the executor droids run. This matters because it keeps the design, dev and test runners unblocked with the right spec structure and frees Sage to focus on higher-order analysis and tactical review instead of hand-crafting every spec. It fires right after per-agent planning and before the executors start, and if a write fails it retries automatically up to three times. Sage's analysis for the step is then persisted by the Step Analysis Submission droid, giving the analysis a tracked, versioned home that downstream roles read as chain input.
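As a rough illustration of the spec-writing step, the sketch below builds the three specs from one task record and writes each with automatic retries. The field names follow the text (title, description, success criteria, execution focus); the function names, the dictionary layout, the use of OSError as the failure signal, and the reading of "up to three times" as one initial attempt plus three retries are assumptions.

```python
# Spec kinds and their intended readers, as described in the text.
SPEC_AUDIENCES = {"design": "Archon", "development": "Forge", "testing": "Argus"}


def build_specs(task):
    """Build one structured spec per executor role from a task record."""
    return {
        kind: {
            "kind": kind,
            "audience": audience,
            "title": task["title"],
            "description": task["description"],
            "success_criteria": list(task["success_criteria"]),
            "execution_focus": task["execution_focus"][kind],
        }
        for kind, audience in SPEC_AUDIENCES.items()
    }


def write_with_retries(write, spec, max_retries=3):
    """Try write(spec) once, then retry up to max_retries times on OSError."""
    last_error = None
    for _ in range(1 + max_retries):
        try:
            write(spec)
            return True
        except OSError as exc:  # assumed failure type for a storage write
            last_error = exc
    raise RuntimeError(f"spec write failed after {max_retries} retries") from last_error
```

In this sketch all three specs would be written before any executor runs, for example by looping over `build_specs(task).values()` and calling `write_with_retries` on each.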