Carol — back to Apps ← Apps

Carolopedia

A friendly guide to Carol, her ecosystem, and the agents who built her.

📖 CarolopediaServicesBuild InitiativesAll activitiesINI-1000499
📋

CAROL-INI-0549-00: Cookbook 106 reviewer chain must engage when Forge produces no executable code (50 active findings)

Initiative
Open in Initiatives →

📖About

DEVIATION: Elrond's auditor reports 50 active step_run_loop_no_albus findings across 36 initiatives and 11 step_run_loop_short_circuit findings across 10 initiatives. Both stem from the same root cause: when Forge's orchestrator fails with 'No executable code generated' BEFORE sized_tasks rows are created, Merlin's Task Reviewer and Step Reviewer never engage, so Albus is never invoked, run_number never bumps, and the initiative blocks at run_number=1 instead of running the cookbook-prescribed 3 attempts. The 300 chain has been re-blocking on this exact pattern for multiple sessions.

ROOT CAUSE: cookbook 106's reviewer-chain wiring assumes a task got sized before failing. When Forge dies before producing a sized task, there is no sized_tasks row to attach iteration/Albus-feedback to, and mt-s1 / sr-s1 are simply never fired by elrond_watcher's step_completion_once.

FIX: (1) Detect the no-sized-tasks failure path in Merlin's orchestrator OR in elrond_watcher and ensure Step Reviewer fires anyway (with an empty task set, the reviewer can still invoke Albus and bump run_number). (2) Alternatively, create a sentinel sized_tasks row when Forge fails before sizing so the existing chain engages. (3) Verify by re-running an auditor sweep — step_run_loop_no_albus active count should fall.

MODE: bypass. SCOPE: agents/agt_011/elrond_watcher.py + agents/merlin/droids/po_s1.py + agents/agt_020/droids/sr_s1.py (Merlin's Step Reviewer).

SUCCESS CRITERIA:

  • A failed exec without sized_tasks rows triggers Merlin's Step Reviewer (sr-s1 droid_run row exists for the exec).
  • Step Reviewer invokes Albus's broadscan on the failure; al-auto-01 droid_run row exists for the same exec.
  • run_number bumps to 2 on the second failed run instead of the initiative going to status=blocked at run_number=1.
  • Smoke test: re-audit one of the 300 chain initiatives after the fix; step_run_loop_no_albus finding for that initiative gets auto-resolved by the re-verifier (CAROL-INI-543).
  • Argus regression: zero NEW FAILURE.

⚖️Decisions

  • Handover-watchdog: gap-H dispatched planned bypass INI 1000499. (elrond)
  • Root cause: call_claude was changed to always return a dict but Merlin Step Reviewer (sr_s1) and Task Reviewer (mt_s1) still treated the result as a string. _re.search crashed with TypeError. Exception was silently swallowed by elrond_watcher.step_completion_once. Net effect: cookbook 106 reviewer chain never engaged anywhere in the system. Reopened and re-closing with real fix + 7 tests + smoke test pass. override_closure_guard=true. (orion)

Success criteria

  • Merlin Step Reviewer + Task Reviewer handle call_claude dict response shape without crashing (must_have)
  • sr-s1 and mt-s1 normalise dict shape ({_raw}, parsed JSON, {error}) into either a direct verdict or string-path JSON parse (must_have)
  • Smoke test: invoking sr-s1 on a previously-blocked failed exec returns verdict, bumps run_number, and triggers Albus broadscan without exception (must_have)
  • 7 unit tests cover dict shapes (parsed JSON, _raw text, error dict, invalid verdict) for both reviewers (must_have)
  • Argus regression: zero NEW FAILURE attributable to this change (must_have)