Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
Build Initiatives: INI-1000353
📖About
DIAGNOSIS: The initiatives API (port 7167) crashed and stayed down because nothing restarts it automatically. The system service carol-initiatives.service is masked, so systemd cannot start it, and carol-apps.service is Type=oneshot with no Restart= setting, so it provides no process supervision. While the API was down, all pipeline operations halted: dispatching, reviewing, and monitoring. Secondary damage: bypass executions for INI 1000351 and INI 1000352 were orphaned and left in stale executing/running states.
EVIDENCE:
- curl http://127.0.0.1:7167 returned exit code 7 (connection refused)
- pgrep -f uvicorn.*7167 returned nothing
- systemctl is-active carol-initiatives.service -> inactive; is-enabled -> masked
- carol-apps.service is Type=oneshot with no Restart=
- 3 executions in plangenerator.db stuck in the executing state for 16-26 minutes with updated_at == created_at (never updated after creation)
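The stuck-execution check in the last evidence item could be reproduced with a query along these lines. This is only a sketch: it assumes plangenerator.db is a SQLite file with a table named executions and columns id, status, created_at, and updated_at stored as epoch seconds. None of that schema is confirmed by the evidence above, and the 15-minute default threshold is likewise an assumption.

```python
import sqlite3

# Hypothetical schema: executions(id, status, created_at, updated_at),
# with timestamps as epoch seconds. Adjust to the real table layout.
STALE_QUERY = """
SELECT id, (? - created_at) / 60.0 AS age_minutes
FROM executions
WHERE status = 'executing'
  AND updated_at = created_at        -- never touched after creation
  AND (? - created_at) >= ? * 60     -- older than the threshold
ORDER BY created_at
"""

def find_stale(conn, now, min_age_minutes=15):
    """Return (id, age_minutes) for executions that never progressed."""
    return conn.execute(STALE_QUERY, (now, now, min_age_minutes)).fetchall()
```

Calling find_stale(sqlite3.connect("plangenerator.db"), time.time()) would list candidates oldest first; the file path here is a placeholder for wherever the database actually lives.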
PROPOSED FIX: Create a cron-based health watchdog script that runs every 5 minutes, checks whether critical apps (especially initiatives on 7167) are listening, and restarts any that are down. Add it to the caroladmin crontab.
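A minimal sketch of such a watchdog, to make the proposal concrete. Only port 7167 comes from the diagnosis; the APPS registry, the restart command path, and the function names are hypothetical placeholders, not the actual Carol tooling.

```python
import socket
import subprocess

# Hypothetical registry: app name -> (port, restart command).
# The restart command below is a placeholder, not a real Carol script.
APPS = {
    "initiatives": (7167, ["/path/to/restart-initiatives.sh"]),
}

def is_listening(port, host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused and timeouts
        return False

def check_and_restart(apps, probe=is_listening, restart=subprocess.run):
    """Probe every app; restart the ones that are down and return their names."""
    restarted = []
    for name, (port, command) in apps.items():
        if not probe(port):
            restart(command, check=False)  # do not raise on a non-zero exit
            restarted.append(name)
    return restarted
```

A script ending with check_and_restart(APPS) could then be scheduled with a crontab entry such as `*/5 * * * * /usr/bin/python3 /path/to/watchdog.py`, where the script path is again a placeholder. The probe and restart arguments are injectable so the logic can be exercised without touching real services.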
⚖️Decisions
- INI-411 orphan sweep: all bypass executions for this initiative were swept as orphaned (>1200s with no owning process). Transitioning to blocked for operator or resume-loop triage. (elrond.orphan_sweep)
- Escalated by autonomous loop: bypass INI 1000353 is blocked; parked INI 1000351 cannot resume. Operator triage needed. (albus)
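The orphan rule quoted in the first decision (more than 1200s with no owning process, then transition to blocked) can be sketched as follows. This is not the elrond.orphan_sweep implementation: the record fields id, state, updated_at, and pid are assumed for illustration, and the liveness check is POSIX-only.

```python
import os
import time

ORPHAN_AGE_SECONDS = 1200  # threshold quoted in the sweep decision

def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0 tests for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True

def sweep_orphans(executions, now=None, alive=pid_alive):
    """Mark stale, ownerless executions as blocked; return the swept ids."""
    now = time.time() if now is None else now
    swept = []
    for ex in executions:
        active = ex["state"] in ("executing", "running")
        age = now - ex["updated_at"]
        if active and age > ORPHAN_AGE_SECONDS and not alive(ex.get("pid")):
            ex["state"] = "blocked"  # parked for operator or resume-loop triage
            swept.append(ex["id"])
    return swept
```

Requiring both conditions (age and a missing owner) keeps a slow but still-running execution from being swept by mistake.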