Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖About
Three remaining gaps in the step+task review wire-up:
1. ELROND AUTHORS STEP CRITERIA. When Elrond's planner creates or replans a phase, each plan step it authors must carry its own success_criteria (a subset of, or derived from, the initiative's criteria). Merlin's Step Reviewer should not have to fall back to inheriting initiative criteria. The phase-replan flow must re-author step criteria in the same way.
2. PLUG TASK REVIEWER INTO PIPELINE. Merlin's Task Reviewer (mt_s1) can currently be invoked manually but does not fire automatically when a task completes. Wire it into the pipeline orchestrator (po_s1) so that every task-exec completion runs the Task Reviewer against the task's success_criteria, retrying per iteration, invoking Albus on each failure, and bubbling up to the step at iteration 3.
3. AUDIT TEMPLATE TEST CASES + REGRESSION TESTS. Checklist templates and step_hooks historically encoded task-level 'correct results' verification (test cases + regression). Audit whether they are up to date: do the referenced test files still exist, do they actually exercise the right things, and are any templates pointing at removed or renamed tests? Update missing or stale references. This is the canonical 'task produced correct results' check that complements mt_s1's semantic criterion eval.
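The retry behaviour described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual po_s1 or mt_s1 code: the names `review_task`, `ReviewOutcome`, and the three callables are hypothetical, and it assumes that "bubble-up at iteration 3" means a failed third review hands the task up to the step.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ITERATIONS = 3  # from the text: bubble up to the step at iteration 3


@dataclass
class ReviewOutcome:
    status: str                       # "passed" or "bubbled_up"
    iterations: int                   # number of iterations actually run
    failures: List[str] = field(default_factory=list)


def review_task(
    execute: Callable[[int], str],                # hypothetical: runs the task, returns its result
    review: Callable[[str], List[str]],           # hypothetical: returns the unmet success criteria
    on_failure: Callable[[int, List[str]], None], # hypothetical: stands in for invoking Albus
) -> ReviewOutcome:
    """Run a task up to MAX_ITERATIONS times, reviewing after each run."""
    failures: List[str] = []
    for iteration in range(1, MAX_ITERATIONS + 1):
        result = execute(iteration)
        unmet = review(result)
        if not unmet:
            return ReviewOutcome("passed", iteration, failures)
        failures.extend(unmet)
        on_failure(iteration, unmet)  # called on every failed review
    # Assumption: a failed final iteration escalates to the step level.
    return ReviewOutcome("bubbled_up", MAX_ITERATIONS, failures)
```

Passing the executor, reviewer, and failure handler as callables keeps the loop independent of how the real orchestrator dispatches work, which is not specified here.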
Cookbook updates: extend the CAROL-INI-415 vocabulary entry to make the planner authoring + task-pipeline plug-in explicit; add a new entry on the test-template + regression-test layer as the 'correct results' verification floor.
⚖️Decisions
- Elrond stuck-watchdog: 3 consecutive failed recovery attempts since 2 strikes recorded. Initiative idle past 600s with no live queue row; Albus invoked 3 times without progress. Flipping to blocked and surfacing on operator queue per CAROL-INI-403. (elrond.handover_watchdog)
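The decision recorded above follows a pattern that can be sketched as below. This is an illustrative reading only: the thresholds (600s idle, 3 failed recovery attempts) come from the entry, but `InitiativeState`, `watchdog_verdict`, and the verdict strings are hypothetical, and the "2 strikes" bookkeeping is not modelled because its exact semantics are not stated.

```python
from dataclasses import dataclass

IDLE_THRESHOLD_S = 600      # from the entry: idle past 600s
MAX_RECOVERY_ATTEMPTS = 3   # from the entry: 3 failed recovery attempts


@dataclass
class InitiativeState:
    idle_seconds: float            # time since the initiative last made progress
    has_live_queue_row: bool       # True if work is still queued or running
    failed_recovery_attempts: int  # consecutive recovery attempts without progress


def watchdog_verdict(state: InitiativeState) -> str:
    """Classify an initiative as healthy, recoverable, or blocked."""
    stuck = state.idle_seconds > IDLE_THRESHOLD_S and not state.has_live_queue_row
    if not stuck:
        return "healthy"
    if state.failed_recovery_attempts >= MAX_RECOVERY_ATTEMPTS:
        # Assumption: this is the point at which the initiative is flipped
        # to blocked and surfaced on the operator queue.
        return "blocked"
    return "attempt_recovery"
```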