Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖About
The step dispatcher currently allows execution to begin without confirming that any evidence-capture mechanism is wired to verify completion, causing steps to exhaust all retry attempts before the gap is detected. CAROL-INI-0055-00 (surfacing agent and droid prompts in the Org employee profile) was blocked at step 284 for exactly this reason: the step produced no observable, verifiable output and the pipeline had no way to know it was incomplete until retries were exhausted. This initiative adds a fail-fast pre-check at dispatch time that requires every step to declare how its success evidence will be captured before execution begins, then re-queues the blocked initiative once the fix is live.
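As a rough illustration of the fail-fast pre-check described above, here is a minimal Python sketch. It is not the dispatcher's actual code: the `Step` fields, the `MissingEvidenceError` name, the `dispatch` function, and the example evidence strings are all hypothetical, chosen only to show the gate firing before the first attempt rather than after retries run out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class MissingEvidenceError(Exception):
    """Raised at dispatch time when a step declares no evidence capture."""


@dataclass
class Step:
    # All field names are hypothetical, for illustration only.
    step_id: str
    run: Callable[[], None]
    evidence: Optional[str] = None  # how success will be observed
    max_retries: int = 3


def dispatch(step: Step) -> int:
    """Run a step and return the number of attempts it took.

    The pre-check runs before the first attempt, so a step with no
    declared evidence mechanism never enters the retry loop.
    """
    if not step.evidence or not step.evidence.strip():
        raise MissingEvidenceError(
            f"step {step.step_id}: no evidence-capture mechanism declared; "
            "declare one in the step definition before dispatch"
        )
    last_error = None
    for attempt in range(1, step.max_retries + 1):
        try:
            step.run()
            return attempt
        except Exception as exc:  # retry on any failure of the step body
            last_error = exc
    raise RuntimeError(f"step {step.step_id} exhausted retries") from last_error


# Usage: a declared step runs; an undeclared one is rejected without running.
calls = []
declared = Step("demo-declared", run=lambda: calls.append("ran"),
                evidence="log:render-complete")
attempts = dispatch(declared)

undeclared = Step("demo-undeclared", run=lambda: calls.append("ran"))
try:
    dispatch(undeclared)
    rejected = False
except MissingEvidenceError as err:
    rejected = True
    print(err)
```

In this sketch the rejection is an exception with an actionable message, so the caller sees the gap immediately instead of a silent retry loop. Steps that already declare evidence take the same path they would take without the gate.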
⚖️Decisions
- Evidence-capture declaration is part of the step definition — not a post-hoc check — so the gap is caught before a step starts, not after retries are exhausted. (orion)
- The fail-fast gate fires at dispatch time; steps that declare no evidence mechanism are rejected immediately and surface a clear, actionable error rather than a silent retry loop. (orion)
- The blocked Org prompts initiative step must have its evidence mechanism explicitly wired (identifying the observable output that proves the Prompts tab renders correctly) before it is re-queued. (orion)
- Existing steps that already have evidence mechanisms wired must continue to execute without modification or disruption. (orion)
- [status-router] planned -> dispatched | event=operator_dispatch | RSI: immediate dispatch (INI-2198) (or-bx-01)
- [status-router] dispatched -> planned | event=dispatch_retract | No longer in the top-3 dispatch window (CAROL-INI-1972). (spb-01)
- [status-router] planned -> dispatched | event=operator_dispatch | RSI: immediate dispatch (or-bx-01)
- [status-router] dispatched -> planned | event=dispatch_retract | No longer in the top-3 dispatch window (CAROL-INI-1972). (spb-01)
- [status-router] planned -> dispatched | event=operator_dispatch | RSI: re-dispatched (or-bx-01)
- [status-router] dispatched -> planned | event=operator_revert | Replaced by RSI autonomous loop (or-bx-01)
- [status-router] planned -> dispatched | event=operator_dispatch | RSI: direct dispatch per cookbook (or-bx-01)
- [status-router] dispatched -> planned | event=dispatch_retract | No longer in the top-3 dispatch window (CAROL-INI-1972). (spb-01)
- [status-router] planned -> dispatched | event=dispatch | RSI: auto-promoted bypasses depth limit (CAROL-INI-2198) (spb-01)
- [status-router] dispatched -> blocked | event=stuck_10min_no_activity | Elrond safety net: initiative has had no activity for 10+ minutes. Blocking under the parallel safety mechanism. (el-watchdog)
- Elrond blocked the initiative under the CAROL-INI-2162 dead-Albus protocol. Albus was supposed to wake for step 0 but did not respond (cause=albus_no_show). Reason: Elrond safety net, initiative stranded 10+ min; Albus wake failed or produced no useful result. (el-s1)
- RSI diagnosed: 2026-07-01 16:06:33 -> improvement #(none). Root cause: The initiative repeatedly fails because Albus does not wake for step 0 (albus_no_show) and gets stuck for 10+ minutes, triggering Elrond's safety net block. Improvement: Add a watchdog timeout or retry logic for Albus wake-up calls to automatically reattempt or escalate bef… (el-rsi-eng-01)
- Orion remediated: Albus RSI group diagnosis (via INI 999900430): [procedural, confidence high] Albus executor did not wake to execute step 0 of the initiative (albus_no_show), as evidenced by the Elrond safety net decision and the prior RSI diagnosis for the same cause group. The initiative remained idle for over 10 minutes, triggering the parallel safety mechanism. No actual work was attempted; the single execution history entry shows 'Idle close — no activity in window.' (orion)
- Orion remediated: Albus RSI group diagnosis (via INI 999900401): [procedural, confidence high] The initiative was filed as 'bypass' mode but the required bypass_start activation was never invoked by the requester (orion), leaving the initiative idle in 'reviewing' status with no executor assigned. The Elrond watchdog then blocked it after 10 minutes of inactivity (stuck_10min_no_activity). (orion)
- Orion remediated: Albus RSI group diagnosis (via INI 1000166): [procedural, confidence high] The initiative was repeatedly retracted from the 3-deep dispatch queue because it could not stay in the top-3 priority window long enough for an operator push, resulting in no execution and a 10-minute inactivity timeout that triggered the Elrond safety net. (orion)
- Orion remediated: Albus RSI group diagnosis (via INI 999900406): [procedural, confidence high] The initiative blocked because it was set to 'reviewing' by an operator at 2026-06-30 03:57:49, but no executor (Albus or operator) ever performed the code changes described in the initiative; the Elrond safety net triggered after 10+ minutes of inactivity, and subsequent attempts to diagnose the block reused Albus-dependent processes that themselves failed to execute, creating a self-reinforcing procedural deadlock. (orion)
- [status-router] blocked -> closed | event=operator_put | PUT /api/initiatives (operator)
- [rsi-group-cure] Cured by the group diagnosis on INI 999900406 (shared cause stuck_10min_no_activity); retriggered as INI 999900674. Root cause: [procedural, confidence high] The initiative blocked because it was set to 'reviewing' by an operator at 2026-06-30 03:57:49, but no executor (Albus or operator) ever performed the code changes described in the initiative; the Elrond safety net triggered after 10+ minutes of inactivity, and subsequent attempts to diagnose the block reused Albus-dependent processes that themselves failed to execute (elrond.rsi_loop)
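The `dispatched -> blocked` transition on `stuck_10min_no_activity` that recurs in the log above can be pictured with a small sketch. This is an assumption about the shape of the rule, not Elrond's implementation: the function name, the `WATCHED_STATUSES` set, and the constant are hypothetical. Only the 10-minute threshold and the status names come from the entries above.

```python
from datetime import datetime, timedelta

# Threshold taken from the log entries ("10+ minutes"); names are hypothetical.
INACTIVITY_LIMIT = timedelta(minutes=10)

# Assumption: only dispatched initiatives are watched in this sketch.
WATCHED_STATUSES = {"dispatched"}


def next_status(status: str, last_activity: datetime, now: datetime) -> str:
    """Return "blocked" if a watched initiative has been idle for the limit or longer."""
    if status in WATCHED_STATUSES and now - last_activity >= INACTIVITY_LIMIT:
        return "blocked"
    return status
```

A rule like this only detects that nothing happened. It cannot tell an Albus no-show from a step that ran but produced no verifiable evidence, which is the gap the dispatch-time pre-check is meant to close.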
✅Success criteria
- The dispatcher refuses to start execution of any step that has no declared evidence-capture mechanism, surfacing a clear rejection reason before the first attempt is made. (must_have)
- The blocked Org prompts initiative advances past the previously blocked step and reaches the next phase without re-entering a blocked state. (must_have)
- All existing pipeline steps that were passing before this change continue to pass after deployment — no regressions introduced. (must_have)
- The pre-check rejects a deliberately evidence-less test step and passes a correctly declared test step in the automated test suite. (must_have)
- A cookbook entry documents the evidence-capture declaration contract so future step authors know what is required at dispatch time. (nice_to_have)
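The automated-test criterion above (reject a deliberately evidence-less step, pass a correctly declared one) could be covered by something like the following sketch. The `precheck` function and the dictionary keys `id` and `evidence` are hypothetical stand-ins for the real pre-check and step definition, which this page does not specify.

```python
import unittest
from typing import Optional


def precheck(step_def: dict) -> Optional[str]:
    """Return a rejection reason, or None if the step may be dispatched.

    Hypothetical stand-in for the real dispatch-time pre-check.
    """
    evidence = step_def.get("evidence")
    if not isinstance(evidence, str) or not evidence.strip():
        return (f"step {step_def.get('id', '<unknown>')}: "
                "no evidence-capture mechanism declared")
    return None


class PreCheckTests(unittest.TestCase):
    def test_rejects_evidence_less_step(self):
        reason = precheck({"id": "test-no-evidence"})
        self.assertIsNotNone(reason)
        self.assertIn("test-no-evidence", reason)

    def test_passes_declared_step(self):
        step = {"id": "test-declared", "evidence": "log:render-complete"}
        self.assertIsNone(precheck(step))
```

Saved in a test module, these cases can be run with `python -m unittest`. Returning a reason string instead of raising keeps the assertions simple; the real gate may signal rejection differently.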