Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
INI-999900404
📖 About
Albus needs a structured three-phase troubleshooting approach. (1) Diagnosis: understand the failure and determine whether it is an enablement issue or a pipeline-wiring issue. (2) Remediation: file a bypass, apply a direct fix, or make recommendations. (3) Recommendations: next steps, including a block. Update the system prompt and context with the SST map, Orion logbook, policies, constitution, build cookbook, and agent/droid profiles. Update shared/agent_reference_library if needed.
⚖️ Decisions
- Elrond's bypass methodology checklist (a reminder, not a gate -- you've got this):
  0. File it with requested_mode='bypass' (planner-vs-bypass is a deliberate choice). bypass_start REFUSES a non-bypass initiative (CAROL-INI-1846), and the dispatcher only skips the bypass lane when the mode says bypass -- a 'planner' mistag lets Merlin's pipeline grab the placeholder step and block your finished work.
  1. File it in planned status -- let the bypass claim/activate it; never file it as active.
  2. Open the bypass (bypass_start) with your droid id + the remediation answer (remediates_initiative_id=NNN, or remediates_nothing=True).
  3. Work the blocks for your work-type: template -> design -> code -> test -> review. Do the real work; record decisions on the initiative as you make them.
  4. Reality is recorded for you at close -- code (files changed), each decision, and the twin-review verdict become real activities tied to this initiative and show in the Activity Tracker like a planner run (CAROL-INI-1840). No dummy rows.
  5. Keep the initiative status moving; it parks in 'reviewing' and is tagged uat-pending for you at close (CAROL-INI-1836), so the stuck-watchdog leaves it alone until UAT.
  6. Close runs the gates (design/architecture compliance + caller-audit). If a gate flags something pre-existing or unrelated to your change, waive it with a clear written rationale -- audit, don't skip.
  7. Bypass skips the planner's auto-orchestration, NOT the standards. Same template checklist, same review, same observability as a planner run. (elrond)
- [status-router] planned -> reviewing | event=operator_put | PUT /api/initiatives (operator)
- [status-router] reviewing -> blocked | event=stuck_10min_no_activity | Elrond safety net: initiative has had no activity for 10+ minutes. Blocking under the parallel safety mechanism. (el-watchdog)
- Elrond blocked initiative under the CAROL-INI-2162 dead-Albus protocol. Albus was supposed to wake for step 0 (cause=albus_no_show) but did not respond. Cause: albus_no_show. Reason: Elrond safety net: initiative stranded 10+ min. Albus wake failed or produced no useful result. (el-s1)
- RSI diagnosed: 2026-07-01 07:13:53 -> improvement #(none). ({'_raw': 'ROOT CAUSE: Albus failed to respond for step 0, triggering the dead-Albus protocol, and the subsequent 10-minute inactivity caused a parallel safety net block from el-watchdog.\n\nIMPROVEMENT: Implement a retry or fallback mechanism so that if Albus does not respond within a timeout, the (el-rsi-eng-01)
- [status-router] blocked -> diagnosis | event=diagnosis_start | RSI loop: oldest blocked (since 2026-07-01 06:30:17); Albus diagnosis INI 999900578 (el-rsi-loop-01)
- Orion remediation in progress: INI-999900578 bypass opened. CAROL-INI-696: an Orion-driven bypass has been opened to remediate this parent. The canonical 'Orion remediated:' marker will be posted on close; see cookbook 156 / 155. (shared.bypass.bypass_start)
- Albus RSI diagnosis (root cause): [procedural, confidence high] No work was ever attempted: the initiative has an empty execution history. It was filed for bypass mode but an operator PUT moved it planned -> reviewing directly (2026-06-30 02:18:49) instead of letting bypass_start claim and activate it, so no executor ever owned it. Albus was then supposed to wake for step 0 but produced no response (albus_no_show, el-s1 2026-07-01 06:35:02), and after 10 minutes of inactivity el-watchdog's parallel safety net blocked it. This is a stranded-pipeline/no-show failure, not a defect in the change itself. (albus)
- Albus RSI recommendations:
  - Refile/retrigger with requested_mode='bypass' and status='planned', then open the lane via bypass_start with the droid id and remediates_initiative_id=999900404. Do NOT manually PUT the initiative into 'reviewing'; that skips the bypass claim and strands it with no executor (per Elrond checklist items 0-2 recorded on the initiative 2026-06-30 02:18:03).
  - Ensure the Albus wake for step 0 has a retry-with-timeout: if Albus does not respond within the wake window, retry once (checking /tmp/carol_claude_concurrency.log for lock waits) before falling through to the dead-Albus protocol. The prior RSI diagnosis (el-rsi-eng-01, 2026-07-01 07:13:53) identified the same albus_no_show root cause.
  - Before starting work, verify the ALBUS_LLM_PROVIDER toggle resolves to an actually-available provider (auto|claude); a misrouted/unavailable provider is the most likely mechanism behind the no-show and is itself success criterion #4 of this initiative.
  - Work the four open success criteria in bypass block order (template -> design -> code -> test -> review), recording a decision on the initiative at each block so the 10-minute stuck-watchdog sees activity and does not re-block.
  Next attempt succeeds because: The change itself was never attempted and nothing in the criteria is inherently broken; once the initiative is claimed through the bypass lane with a live executor and the Albus wake retry prevents a silent no-show, the watchdog has activity to observe and the work can proceed normally. (albus)
- Orion remediated: INI-999900578 bypass closed. CAROL-INI-696 close-marker: the Orion bypass INI-999900578 filed against this parent reached terminal state (closed). This row's literal prefix 'Orion remediated:' is the canonical signal the cookbook-155 dispatcher gate looks for. (shared.bypass.bypass_end)
- Orion remediated: Albus RSI diagnosis: [procedural, confidence high] No work was ever attempted: the initiative has an empty execution history. It was filed for bypass mode but an operator PUT moved it planned -> reviewing directly (2026-06-30 02:18:49) instead of letting bypass_start claim and activate it, so no executor ever owned it. Albus was then supposed to wake for step 0 but produced no response (albus_no_show, el-s1 2026-07-01 06:35:02), and after 10 minutes of inactivity el-watchdog's parallel safety net blocked it. This is a stranded-pipeline/no-show failure, not a defect in the change itse (orion)
- [rsi-retrigger-failed] {'ok': False, 'reason': 'create_returned_no_id: {\'error\': \'INI2205_BAD_CRITERIA: All success criteria appear process-only (LLM confirmed). Each must describe a measurable user-visible outcome. FAIL\', \'criteria\': [\'Three-phase structure (Diagnosis → Remediation → Recommendation) in Albus SYSTEM_PROMPT\', \'Agent/droid profiles context in _build_user_prompt so Albus knows who owns what\', \'P (elrond.rsi_loop)
- [rsi-retrigger-stuck] OPERATOR ACTION NEEDED: 3 consecutive retrigger filing failures; the RSI loop skips this family until the rsi-retrigger-stuck tag is removed. Last error: {'ok': False, 'reason': 'create_returned_no_id: {\'error\': \'INI2205_BAD_CRITERIA: All success criteria appear process-only (LLM confirmed). Each must describe a measurable user-visible outcome. FAIL\', \'criteria\': [\'Three-phase structure (Diagnosis → Remediation → Recommendation) in Albus SYSTE (elrond.rsi_loop)
- [status-router] diagnosis -> closed | event=operator_put | PUT /api/initiatives (operator)
- Closed: superseded by follow-on INI 999900579 (CAROL-INI-2169-01: Albus structured troubleshooting: Diagnosis → Remediation → Recommendation with SST/Logbook/Policies/Profiles context) (elrond.initiative_author)
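Two of the recommendations Albus recorded in the decisions above (file the initiative as planned with requested_mode='bypass', and retry the step-0 wake once before falling through to the dead-Albus protocol) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Carol's actual code: the function names `check_bypass_filing` and `wake_with_retry`, the `wake` callable, the dictionary shape of the initiative, and the 60-second timeout are all hypothetical. Only the values `requested_mode='bypass'`, status `planned`, and the cause `albus_no_show` come from the log.

```python
from typing import Callable, List, Optional


def check_bypass_filing(initiative: dict) -> List[str]:
    """Return filing problems; an empty list means the filing looks right.

    Hypothetical pre-flight check mirroring checklist items 0 and 1.
    """
    problems = []
    if initiative.get("requested_mode") != "bypass":
        problems.append("requested_mode must be 'bypass'")
    if initiative.get("status") != "planned":
        problems.append("status must be 'planned' so the bypass can claim it")
    return problems


def wake_with_retry(wake: Callable[[float], Optional[str]],
                    timeout_s: float = 60.0,  # assumed wake window
                    retries: int = 1) -> dict:
    """Call wake(timeout_s); on no response, retry up to `retries` times.

    `wake` is a stand-in for whatever actually wakes Albus. It is expected
    to return a response string, or None when the wake window elapses.
    """
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        response = wake(timeout_s)
        if response:
            return {"ok": True, "attempts": attempts, "response": response}
    # Only now fall through to the no-show handling.
    return {"ok": False, "attempts": attempts, "cause": "albus_no_show"}


# The stranded case from this initiative: right mode, wrong status.
print(check_bypass_filing({"requested_mode": "bypass", "status": "reviewing"}))

# A wake that is silent once and then answers succeeds on the second attempt.
replies = iter([None, "diagnosis ready"])
print(wake_with_retry(lambda timeout_s: next(replies)))
```

Keeping the retry count at one matches the "retry once" wording of the recommendation; a real implementation would also need to decide how the wake window is measured.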
✅ Success criteria
- Three-phase structure (Diagnosis → Remediation → Recommendation) in Albus SYSTEM_PROMPT (must_have)
- Agent/droid profiles context in _build_user_prompt so Albus knows who owns what (must_have)
- Policies, constitution, cookbook, designs added to shared/agent_reference_library SST map (must_have)
- Albus LLM toggle (ALBUS_LLM_PROVIDER = auto|claude) routes through active provider by default (must_have)
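The last criterion, together with the recommendation to verify that the toggle resolves to an actually-available provider, might look something like the sketch below. It is a guess at the shape of the logic, not the real resolver: `resolve_albus_provider`, its parameters, and the rule that an unset toggle behaves like `auto` are assumptions. How the toggle is actually read (an environment variable is one possibility) is not stated in the source. Only the toggle name and its two values, `auto` and `claude`, come from the criterion.

```python
from typing import Iterable, Optional

VALID_TOGGLES = ("auto", "claude")  # the two values named in the criterion


def resolve_albus_provider(toggle: Optional[str],
                           active_provider: str,
                           available: Iterable[str]) -> str:
    """Map the toggle to a concrete provider; fail loudly if it is unavailable."""
    # Assumption: an unset or empty toggle is treated as "auto".
    value = (toggle or "auto").strip().lower()
    if value not in VALID_TOGGLES:
        raise ValueError(
            f"ALBUS_LLM_PROVIDER must be one of {VALID_TOGGLES}, got {value!r}"
        )
    # "auto" routes through whichever provider is currently active.
    provider = active_provider if value == "auto" else value
    if provider not in set(available):
        # Surfacing this early avoids a silent no-show later.
        raise RuntimeError(f"provider {provider!r} is not available")
    return provider


# Hypothetical call: toggle left on auto, with claude as the active provider.
print(resolve_albus_provider("auto", active_provider="claude", available=["claude"]))
```

Raising an error instead of quietly falling back is a deliberate choice in this sketch: the diagnosis names a misrouted or unavailable provider as the most likely mechanism behind the no-show, so an explicit failure is easier to trace.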