How Merlin's Team Builds One Step

Merlin is the orchestrator of Carol's build pipeline, but he does not invent paths—he instantiates one from a fixed template library and generates a context-aware prompt for each specialist on the fly. The insight is specialisation by phase: Sage always analyses, Archon always designs, Forge always codes, Argus always tests, Radagast always deploys. Each one works in the same groove—same phase, same inputs, same success criteria—every run. That constraint is what lets Merlin run the same pipeline a thousand times unattended, generating new prompts for each step but never inventing a new process. When everyone knows their phase deeply, the system can be trusted to move forward autonomously without a human in the loop.
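To make "instantiate, don't invent" concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not Merlin's actual implementation: the phase labels, the `TEMPLATES` dictionary, the `standard_build` template name, and the prompt wording are all hypothetical. Only the pairing of specialist to job comes from the story.

```python
from __future__ import annotations

from dataclasses import dataclass

# Specialist per phase, as described in the story. The phase labels are assumed.
SPECIALISTS = {
    "analyse": "Sage",
    "design": "Archon",
    "code": "Forge",
    "test": "Argus",
    "deploy": "Radagast",
}

# Hypothetical template library: each template is a fixed, ordered list of phases.
TEMPLATES = {
    "standard_build": ["analyse", "design", "code", "test", "deploy"],
}


@dataclass(frozen=True)
class Assignment:
    phase: str
    agent: str
    prompt: str


def instantiate(template_name: str, step_context: str) -> list[Assignment]:
    """Instantiate a fixed template. The process never changes; only the prompt text does."""
    return [
        Assignment(
            phase=phase,
            agent=SPECIALISTS[phase],
            prompt=f"[{phase}] You are {SPECIALISTS[phase]}. Context: {step_context}",
        )
        for phase in TEMPLATES[template_name]
    ]
```

Calling `instantiate("standard_build", "add login form")` would yield the same five assignments in the same order on every run, with only the context string inside each prompt changing.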

A step moves through three phases, always in order: Decide (Sage and Archon settle what to do), Execute (Forge builds and self-verifies), and Review (Argus proves it works). When a task fails, Merlin does not just retry. He invokes Albus the architect, a reactive diagnostician, who reads the logs and code state and writes down the root cause. That diagnosis is threaded into the next attempt's prompt for every agent: Sage re-analyses with it, Forge codes smarter, Argus tests with new context. The retry is smarter, not just repeated. If the step itself fails (every task passes Execute, but Review catches something deeper), the entire step resets to "planned" and Merlin re-authors the task chain from scratch, carrying Albus's insight forward. After three step failures, Merlin escalates to Elrond, head of engineering. The loop bounds itself at three and escalates cleanly.
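The task-level part of that loop can be sketched roughly as follows. This is a sketch under stated assumptions: `build_prompt`, `run_task`, and the `attempt` and `diagnose` callables are hypothetical names standing in for the specialists and for Albus, and the prompt layout is invented for illustration. The limit of three attempts and the idea of feeding each diagnosis into the next prompt are taken from the story.

```python
from __future__ import annotations

from typing import Callable

MAX_TASK_ATTEMPTS = 3  # the story bounds the loop at three


def build_prompt(base_prompt: str, diagnoses: list[str]) -> str:
    """Thread every earlier diagnosis into the prompt for the next attempt."""
    if not diagnoses:
        return base_prompt
    notes = "\n".join(f"- {d}" for d in diagnoses)
    return f"{base_prompt}\n\nKnown root causes from earlier attempts:\n{notes}"


def run_task(
    base_prompt: str,
    attempt: Callable[[str], tuple[bool, str]],  # stand-in for the specialists; returns (ok, logs)
    diagnose: Callable[[str], str],              # stand-in for Albus; returns a root cause
) -> tuple[bool, list[str]]:
    """Retry a task up to MAX_TASK_ATTEMPTS times, diagnosing after each failure."""
    diagnoses: list[str] = []
    for _ in range(MAX_TASK_ATTEMPTS):
        ok, logs = attempt(build_prompt(base_prompt, diagnoses))
        if ok:
            return True, diagnoses
        diagnoses.append(diagnose(logs))
    return False, diagnoses
```

The point of the design is that the second attempt never receives the same prompt as the first: a failure always adds a diagnosis, so each retry starts with strictly more information than the one before it.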

The power of Merlin's loop is not in retrying; it is in learning from failure. When a task fails and Albus diagnoses "the test framework does not support async timeout in this version" or "the component prop typing is incomplete", that insight is not archived—it is captured as text and threaded directly into the prompts the next attempt will receive. Forge will see the diagnosis when he codes, Argus when he tests. The specialists do not repeat the same mistake because the system's failure becomes their shared knowledge. In a manual pipeline, a failed task is a bottleneck; in Merlin's loop, it is a data point that makes the next attempt smarter. This is how a system learns to fix itself.

Merlin's loop is bounded: three task attempts, three step runs. If the step still fails, he does not retry again. Instead, he captures the full history, invokes Albus to record a diagnosis, and escalates to Elrond. Elrond reviews and decides; he never reruns a task. Task-level recovery is Merlin's domain, step-level decisions belong to Elrond, and escalation goes to Orion. The authority is split deliberately. One more gate lies beyond: UAT, the user acceptance test sign-off by the person who owns the outcome. Ninad signs off for Orion's work, the block owner for Albus's fixes, Hermione for her own retriggers. Only on UAT pass does the initiative close. Automation proves correctness; humans validate purpose. The system runs unattended, but it knows when to stop and ask for help.
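The step-level bound and the UAT gate can be sketched the same way. Again, this is only an illustration: `run_step`, `Outcome`, and the two callables are hypothetical, and the story does not say what happens when UAT is not passed, so the `AWAITING_UAT` outcome below is an assumption that the initiative simply stays open.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable

MAX_STEP_RUNS = 3  # three step runs, then escalate


class Outcome(Enum):
    CLOSED = "closed"                  # step passed and UAT signed off
    AWAITING_UAT = "awaiting_uat"      # assumption: without sign-off the initiative stays open
    ESCALATED = "escalated_to_elrond"  # Merlin stops retrying and hands over


def run_step(
    run_once: Callable[[list[str]], tuple[bool, str]],  # returns (passed, diagnosis)
    uat_sign_off: Callable[[], bool],                    # the human who owns the outcome
) -> tuple[Outcome, list[str]]:
    """Run a step at most MAX_STEP_RUNS times, carrying the diagnosis history forward."""
    history: list[str] = []
    for _ in range(MAX_STEP_RUNS):
        passed, diagnosis = run_once(history)
        if passed:
            # Automation proved correctness; a human still has to validate purpose.
            outcome = Outcome.CLOSED if uat_sign_off() else Outcome.AWAITING_UAT
            return outcome, history
        history.append(diagnosis)
    # Bound reached: no fourth run, the full history goes to Elrond.
    return Outcome.ESCALATED, history
```

Note that `run_step` never decides what Elrond does with an escalation. It only returns the outcome and the history, which mirrors the deliberate split of authority described above.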

Orion's Logbook