Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖About
Carol-internal infra. Build a single shared "middleman" service (shared/api_router.py) that all Carol code uses for external API calls (LLM, image, embedding, ...). The service owns:
- a vendor preference list per category
- exponential-backoff retries on 429
- automatic failover to the next vendor on quota or auth failure
- a persistent deferred_calls table for cross-vendor exhaustion
- a structured response with engine_used + cost_usd
Designer (Archon, agt_002) owns the routing droid that defines per-category prompts and vendor preferences. Merlin (agt_020) routes image-generation activities to Designer per a new policy. The Image Generator app moves from Loki (agt_019) to Archon (agt_002), since rendering is a design-craft concern, not a marketing one. This unblocks INI-077 (Carolopedia bulk run) by removing single-vendor failure modes. Owner: Orion (build); runtime ownership: Archon (Designer).
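The router behavior described above could look roughly like the following Python sketch. It is not the actual contents of shared/api_router.py. The class and exception names (ApiRouter, RouterResponse, RateLimited, VendorUnavailable), the retry count, and the base delay are all assumptions made for illustration. Only engine_used, cost_usd, and the deferred-calls idea come from the description. The sketch also assumes that a vendor whose 429 retries are exhausted is treated like a failed vendor and skipped.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


class RateLimited(Exception):
    """Hypothetical: raised by a vendor adapter on HTTP 429."""


class VendorUnavailable(Exception):
    """Hypothetical: raised by a vendor adapter on quota or auth failure."""


@dataclass
class RouterResponse:
    ok: bool
    engine_used: Optional[str]   # name of the vendor that served the call
    cost_usd: float
    payload: Any = None
    deferred: bool = False       # True when every vendor was exhausted


# A vendor adapter takes a request dict and returns (payload, cost_usd).
VendorFn = Callable[[dict], Tuple[Any, float]]


class ApiRouter:
    def __init__(
        self,
        preferences: Dict[str, List[Tuple[str, VendorFn]]],
        max_retries: int = 3,
        base_delay: float = 0.5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.preferences = preferences   # category -> ordered vendor list
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.sleep = sleep               # injectable so tests need not wait
        # In-memory stand-in for the persistent deferred_calls table.
        self.deferred: List[Tuple[str, dict]] = []

    def call(self, category: str, request: dict) -> RouterResponse:
        for name, vendor in self.preferences.get(category, []):
            for attempt in range(self.max_retries + 1):
                try:
                    payload, cost = vendor(request)
                    return RouterResponse(True, name, cost, payload)
                except RateLimited:
                    if attempt == self.max_retries:
                        break  # assumption: give up and fail over
                    # Exponential backoff: base, 2*base, 4*base, ...
                    self.sleep(self.base_delay * 2 ** attempt)
                except VendorUnavailable:
                    break      # quota/auth failure: fail over immediately
        # Cross-vendor exhaustion: park the call for a later retry.
        self.deferred.append((category, request))
        return RouterResponse(False, None, 0.0, deferred=True)
```

Passing sleep as a parameter is a design choice of this sketch. It lets a test record the backoff delays instead of actually waiting.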
⚖️Decisions
- Follow-on to parent INI 999900516 (orion)
- Scope inherited verbatim from parent INI 999900516 per CAROL-INI-361. (elrond.initiative_author)
- Criteria refinement (CAROL-INI-509): Refined: Original criterion 'Owner: Orion (build); runtime ownership: Archon (Designer)' struck because Orion is a human-CLI identity per cookbook #172, not an agent owner for build; runtime ownership model is now via Albus executor. (elrond.initiative_author)
- Criteria refinement (CAROL-INI-509): Refined: Struck specific references to 'Archon/agt_002' and 'Loki/agt_019' because neither agent exists in the current registry (43 agents, no Archon or Loki listed). The concept of Designer remains but agent assignment is handled by new step. (elrond.initiative_author)
- Criteria refinement (CAROL-INI-509): Refined: Struck specific reference to 'Merlin (agt_020)' routing policy because present-day cookbook #76 and #32 show Merlin's routing is template-based and the policy is described in step-owner terms, not agent ID. (elrond.initiative_author)
- Criteria refinement (CAROL-INI-509): Refined: Struck specific droid name 'ar_ir_01' because no such droid exists in the registry; the routing droid is described generically in the new step. (elrond.initiative_author)
- [status-router] planned -> dispatched | event=dispatch | RSI: auto-promoted bypasses depth limit (CAROL-INI-2198) (spb-01)
- [status-router] dispatched -> blocked | event=stuck_10min_no_activity | Elrond safety net: initiative has had no activity for 10+ minutes. Blocking under the parallel safety mechanism. (el-watchdog)
- Elrond blocked initiative under the CAROL-INI-2162 dead-Albus protocol. Albus was supposed to wake for step 0 (cause=albus_no_show) but did not respond. Cause: albus_no_show. Reason: Elrond safety net: initiative stranded 10+ min. Albus wake failed or produced no useful result. (el-s1)
- Orion remediated: Albus RSI group diagnosis (via INI 999900649): [infra, confidence high] The initiative blocked because the Albus agent failed to respond to the wake call for step 0, triggering the dead-Albus protocol under the parallel safety mechanism. No execution history exists, and the block event occurred immediately after dispatch, indicating a procedural/infra failure in the wake mechanism rather than a work failure. (orion)
- [status-router] blocked -> closed | event=operator_put | PUT /api/initiatives (operator)
- [rsi-group-cure] Cured by the group diagnosis on INI 999900649 (shared cause stuck_10min_no_activity); retriggered as INI 999900896. Root cause: [infra, confidence high] The initiative blocked because the Albus agent failed to respond to the wake call for step 0, triggering the dead-Albus protocol under the parallel safety mechanism. No execution history exists, and the block event occurred immediately after dispatch, indicating a procedural/infra failure in the wake mechanism rather than a work failure. (elrond.rsi_loop)
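The dispatched -> blocked transition recorded in the log above can be illustrated with a minimal stall check. This is a sketch only. The Initiative dataclass, the check_stalled function, and the timestamp representation are hypothetical. Only the status names, the 10-minute threshold, and the stuck_10min_no_activity event name are taken from the log entries.

```python
from dataclasses import dataclass
from typing import Optional

STALL_SECONDS = 10 * 60  # threshold implied by "stuck_10min_no_activity"


@dataclass
class Initiative:
    ini_id: str
    status: str           # e.g. "planned", "dispatched", "blocked", "closed"
    last_activity: float  # timestamp in seconds (hypothetical field)


def check_stalled(
    ini: Initiative, now: float, stall_seconds: float = STALL_SECONDS
) -> Optional[str]:
    """Block a dispatched initiative that has shown no activity.

    Returns the event name when the initiative was blocked, else None.
    """
    if ini.status != "dispatched":
        return None  # only dispatched initiatives are watched
    if now - ini.last_activity < stall_seconds:
        return None  # still within the allowed quiet period
    ini.status = "blocked"
    return "stuck_10min_no_activity"
```

The threshold is a parameter, so the same check could serve a shorter alert window.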
✅Success criteria
- A step-0 handler exists on initiative creation that triggers an immediate wake call to the assigned executor (Albus) to prevent procedural deadlock. (must_have)
- Image Generator app is reassigned to the Designer agent (Archon). (must_have)
- A Carol policy is published requiring all external API calls (LLM, image, embedding) to go through the shared API router. (must_have)
- A Carol policy is published stating that Merlin routes image-generation activities to the Designer routing droid. (must_have)
- shared/api_router.py skeleton exists and includes: vendor preference list per category, exponential-backoff retries on 429, automatic failover to next vendor on quota or auth failure. (must_have)
- Image generation category is wired into the shared API router's vendor selection and fallback logic. (must_have)
- A persistent deferred_calls table and a retrier hook are created to handle cross-vendor exhaustion. (must_have)
- The Designer agent has a routing droid that defines per-category prompts and vendor preferences. (must_have)
- The /generate/portrait endpoint routes through the shared API router. (must_have)
- INI-077 dependency is declared in the initiative metadata. (nice_to_have)
- A smoke test passes: four portraits are successfully generated via the new API router. (must_have)
- A transient watchdog alerts if no executor activity is detected within 5 minutes of dispatch. (must_have)
- The initiative is explicitly transitioned to 'completed' status after all steps are verified done. (must_have)
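The deferred_calls table and retrier hook named in the criteria above might be sketched with SQLite as follows. The column names, the status values, and the function names (open_store, defer_call, retry_pending) are assumptions for illustration. The criteria specify only that the table is persistent and that a retrier hook exists.

```python
import json
import sqlite3
import time
from typing import Callable

# Hypothetical schema; only the table name comes from the criteria.
SCHEMA = """
CREATE TABLE IF NOT EXISTS deferred_calls (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    category     TEXT    NOT NULL,
    request_json TEXT    NOT NULL,
    created_at   REAL    NOT NULL,
    attempts     INTEGER NOT NULL DEFAULT 0,
    status       TEXT    NOT NULL DEFAULT 'pending'
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the table; pass a file path to persist."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def defer_call(conn: sqlite3.Connection, category: str, request: dict) -> int:
    """Record a call that no vendor could serve; returns the new row id."""
    cur = conn.execute(
        "INSERT INTO deferred_calls (category, request_json, created_at) "
        "VALUES (?, ?, ?)",
        (category, json.dumps(request), time.time()),
    )
    conn.commit()
    return cur.lastrowid


def retry_pending(
    conn: sqlite3.Connection, send: Callable[[str, dict], bool]
) -> int:
    """Retrier hook: replay pending calls through send(category, request).

    Rows for which send returns True are marked 'done'; the rest stay
    'pending' with their attempt counter incremented. Returns the number
    of calls completed in this pass.
    """
    rows = conn.execute(
        "SELECT id, category, request_json FROM deferred_calls "
        "WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    completed = 0
    for row_id, category, request_json in rows:
        ok = send(category, json.loads(request_json))
        conn.execute(
            "UPDATE deferred_calls SET attempts = attempts + 1, status = ? "
            "WHERE id = ?",
            ("done" if ok else "pending", row_id),
        )
        completed += int(ok)
    conn.commit()
    return completed
```

In this sketch the send callback would wrap the router itself, so a deferred call goes through the same vendor selection and fallback logic on each retry.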