
Carolopedia

A friendly guide to Carol, her ecosystem, and the agents who built her.

INI-999900332

CAROL-INI-2099-00: Enable Albus to root-cause and avoid pipeline failures: SST+Logbook context + root-cause mission reframe

Initiative

📖About

Albus's troubleshooter can wake (CAROL-INI-2097) but cannot solve pipeline-level failures (e.g. INI-9 review-failed because the reviewer reads task-evidence from a table that no code populates).

Root cause of his limitation:

  • (a) His context shows only the initiative's own touched files and the text of the reviewer's verdict, never the pipeline's structure or the responsible component's code.
  • (b) His mission is framed as 'fix the failed step', so he tries to redo the worker's task instead of removing the root cause.

Enable him (Ninad's directive): his action is to address the root cause so that either (1) the failure is avoided or (2) the review failure is avoided, by fixing or routing the responsible component, not by doing the failed worker task himself.

Build:

  • (1) shared/agent_reference_library.py: a compact 'system reference library' with three parts: the SST where-truth-lives map (constitution, policies, designs, requirements, cookbook, SST, registry/service-catalogue: what each holds plus the exact query command); Orion's Logbook index (recent build-story titles plus how to read them); and the registry ownership pointer, so he can resolve component -> owner. It is given by default in his context (Ninad: give Albus SST + Logbook by default, let him read further as needed).
  • (2) Reframe Albus's troubleshooter mission: remove the root cause to avoid the failure or the review-failure; do not redo the failed task (the doer owns that). The root cause is often the build machinery itself (reviewer/sizer/dispatcher/evidence-flow/gate). Use the SST to locate the responsible component and its owner, and the Logbook to learn why it is built that way. Address it where it lives: DIRECT_FIX in his own bypass session, even on pipeline code, or route to the owner, file a fix-initiative, or escalate to Orion with a structured root cause. Avoid the verdict-treadmill: route by owner (data from the SST), reusing the existing verdicts as mechanisms; do not add a verdict per failure class.
  • (3) Cookbook + regression.
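To make the shape of the 'system reference library' concrete, here is a minimal sketch of what a module like shared/agent_reference_library.py might hold. It is an illustration only, not the shipped code: the class names, field names, and the render format are assumptions. The source names come from the text above, while what each source holds, its query command, the Logbook titles, and the owners are left as explicit placeholders because the initiative does not state them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TruthSource:
    """One entry in the where-truth-lives map (hypothetical structure)."""
    name: str           # e.g. "constitution"
    holds: str          # what the source holds (placeholder in this sketch)
    query_command: str  # exact query command (placeholder in this sketch)


@dataclass
class ReferenceLibrary:
    """Compact context block: truth map + Logbook index + ownership pointer."""
    sources: List[TruthSource]
    logbook_titles: List[str]
    owners: Dict[str, str] = field(default_factory=dict)

    def owner_of(self, component: str) -> Optional[str]:
        """Resolve component -> owner; None means no ownership entry exists."""
        return self.owners.get(component)

    def render(self, max_titles: int = 5) -> str:
        """Render a short plain-text block suitable for an agent's context."""
        lines = ["SYSTEM REFERENCE LIBRARY", "Where truth lives:"]
        for src in self.sources:
            lines.append(f"  - {src.name}: {src.holds} | query: {src.query_command}")
        lines.append("Logbook (recent build stories):")
        for title in self.logbook_titles[:max_titles]:
            lines.append(f"  - {title}")
        lines.append("Component owners:")
        for component, owner in sorted(self.owners.items()):
            lines.append(f"  - {component} -> {owner}")
        return "\n".join(lines)


# Source names are the ones listed in the initiative; everything else is a placeholder.
SOURCE_NAMES = (
    "constitution", "policies", "designs", "requirements",
    "cookbook", "SST", "registry/service-catalogue",
)


def build_placeholder_library() -> ReferenceLibrary:
    """Build a library with placeholder descriptions, commands, and owners."""
    sources = [
        TruthSource(name, "<what it holds>", "<exact query command>")
        for name in SOURCE_NAMES
    ]
    return ReferenceLibrary(
        sources=sources,
        logbook_titles=["<recent build-story title>"],
        owners={"reviewer": "<owner>"},
    )
```

Calling build_placeholder_library().render() yields one short text block, which matches the 'given by default, read further as needed' intent: the block names where to look and how to query, rather than inlining the sources themselves.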

⚖️Decisions

  • Auto-detected remediation target INI-999900330 from title/description scan (matched CAROL-INI-2097 -> row id 999900330 (CAROL-INI-2097-00: A step/initiative reaches done/closed/blocked only for a real)); override by setting remediates_initiative_id explicitly at bypass_start. (system-auto-detect)
  • Elrond's bypass methodology checklist (a reminder, not a gate -- you've got this):
      0. File it with requested_mode='bypass' (planner-vs-bypass is a deliberate choice). bypass_start REFUSES a non-bypass initiative (CAROL-INI-1846), and the dispatcher only skips the bypass lane when the mode says bypass -- a 'planner' mistag lets Merlin's pipeline grab the placeholder step and block your finished work.
      1. File it in planned status -- let the bypass claim/activate it; never file it as active.
      2. Open the bypass (bypass_start) with your droid id + the remediation answer (remediates_initiative_id=NNN, or remediates_nothing=True).
      3. Work the blocks for your work-type: template -> design -> code -> test -> review. Do the real work; record decisions on the initiative as you make them.
      4. Reality is recorded for you at close -- code (files changed), each decision, and the twin-review verdict become real activities tied to this initiative and show in the Activity Tracker like a planner run (CAROL-INI-1840). No dummy rows.
      5. Keep the initiative status moving; it parks in 'reviewing' and is tagged uat-pending for you at close (CAROL-INI-1836), so the stuck-watchdog leaves it alone until UAT.
      6. Close runs the gates (design/architecture compliance + caller-audit). If a gate flags something pre-existing or unrelated to your change, waive it with a clear written rationale -- audit, don't skip.
      7. Bypass skips the planner's auto-orchestration, NOT the standards. Same template checklist, same review, same observability as a planner run.
    (elrond)
  • [status-router] planned -> executing | event=bypass_executing | bypass transition (or-bx-01)
  • [status-router] executing -> reviewing | event=bypass_reviewing | bypass transition (or-bx-01)
  • [status-router] reviewing -> closed | event=operator_signoff | Auto-accepted (CAROL-INI-1859): Orion-initiated, >2 days in reviewing with no objection. (el-srac-01)
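The three status-router entries above trace one path through the bypass lifecycle. As a rough sketch, that path can be written as a transition table keyed by (current status, event). The statuses and event names are copied from the log lines; the table layout and the next_status helper are assumptions for illustration, not the status-router's actual implementation, and any transitions not shown in the log are omitted.

```python
from typing import Dict, Tuple

# (current status, event) -> next status, as observed in the log entries above.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("planned", "bypass_executing"): "executing",
    ("executing", "bypass_reviewing"): "reviewing",
    ("reviewing", "operator_signoff"): "closed",
}


def next_status(current: str, event: str) -> str:
    """Return the next status, or raise if the log shows no such transition."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(
            f"no transition from {current!r} on event {event!r}"
        ) from None
```

Replaying the logged events in order from 'planned' ends at 'closed'; an event that does not match the current status (for example operator_signoff while still planned) raises instead of silently moving the initiative.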

Success criteria

  • Albus troubleshooter context includes a system reference library by default: the SST where-truth-lives map (constitution/policies/designs/requirements/cookbook/SST/registry + query commands) and the Orion Logbook index (must_have)
  • Albus mission is reframed to remove the ROOT CAUSE so the failure or the review-failure is avoided, NOT to redo the failed worker task; he is told to suspect the build machinery and use the SST to find the responsible component and its owner (must_have)
  • No new verdict-per-failure-class is added; routing is by owner using the existing verdicts (DIRECT_FIX in his bypass / route to owner / file fix-initiative / escalate to Orion with a structured root cause) (must_have)
  • cookbook note + regression added (must_have)
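The 'route by owner, no verdict per failure class' criterion can be sketched as a small decision function whose inputs are ownership data and fix size, never the failure class. Only DIRECT_FIX is named in the initiative; the other verdict names, the RootCause fields, and the decision rules below are hypothetical and stand in for whatever the existing verdicts actually are. The file-a-fix-initiative path is left out to keep the sketch short.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    DIRECT_FIX = "DIRECT_FIX"                # named in the initiative
    ROUTE_TO_OWNER = "ROUTE_TO_OWNER"        # hypothetical name
    ESCALATE_TO_ORION = "ESCALATE_TO_ORION"  # hypothetical name


@dataclass(frozen=True)
class RootCause:
    """Structured root cause (hypothetical fields)."""
    component: str         # responsible component, e.g. "reviewer"
    owner: Optional[str]   # owner resolved from the SST/registry, if any
    summary: str           # what is actually wrong
    small_fix: bool        # assumed flag: fixable inside the bypass session


def route(cause: RootCause) -> Verdict:
    """Pick an existing verdict from owner data, not from the failure class."""
    if cause.owner is None:
        # No owner resolved: hand Orion the structured root cause.
        return Verdict.ESCALATE_TO_ORION
    if cause.small_fix:
        # Fix it where it lives, in the troubleshooter's own bypass session.
        return Verdict.DIRECT_FIX
    # Owner known but the fix is too large to do directly.
    return Verdict.ROUTE_TO_OWNER
```

Because the function never branches on the kind of failure, a new failure class needs only a new RootCause value, not a new verdict, which is the point of avoiding the verdict-treadmill.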