Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖 Carolopedia › Services › Build Initiatives › All activities › INI-999900468
📋
📖About
Compute Hermione process monitoring stats: per-service current-week process success rate (completed/(completed+failed) x10) from run-audit rows, mapped to services via the shared resolver, target 100%. Mirror the Quality Scorecard weekly methodology (current week + carry-forward + 10-week history). Add a Process RSI scoreboard store + daily collector droid (10-week history) and an improvement engine that files initiatives for below-target services so Hermione improves week over week. Make the scorecard measures active with real scores; dash where no run data.
⚖️Decisions
- Elrond's bypass methodology checklist (a reminder, not a gate -- you've got this): 0. File it requested_mode='bypass' (planner-vs-bypass is a deliberate choice). bypass_start REFUSES a non-bypass initiative (CAROL-INI-1846), and the dispatcher only skips the bypass lane when the mode says bypass -- a 'planner' mistag lets Merlin's pipeline grab the placeholder step and block your finished work. 1. Filed as planned status -- let the bypass claim/activate it; never file active. 2. Open the bypass (bypass_start) with your droid id + the remediation answer (remediates_initiative_id=NNN, or remediates_nothing=True). 3. Work the blocks for your work-type: template -> design -> code -> test -> review. Do the real work; record decisions on the initiative as you make them. 4. Reality is recorded for you at close -- code (files changed), each decision, and the twin-review verdict become real activities tied to this initiative and show in the Activity Tracker like a planner run (CAROL-INI-1840). No dummy rows. 5. Keep the initiative status moving; it parks in 'reviewing' and is tagged uat-pending for you at close (CAROL-INI-1836), so the stuck-watchdog leaves it alone until UAT. 6. Close runs the gates (design/architecture compliance + caller-audit). If a gate flags something pre-existing or unrelated to your change, waive it with a clear written rationale -- audit, don't skip. 7. Bypass skips the planner's auto-orchestration, NOT the standards. Same template checklist, same review, same observability as a planner run. (elrond)
- [status-router] planned -> executing | event=bypass_executing | bypass transition (or-bx-01)
- Caller audit waived by Orion: this bypass added shared/process_health_metric.py and wired the Process Health Scorecard to show real weekly scores. No change to shared/bypass.py or bypass-runtime; INI-716 is a fleet-wide false-positive from uncommitted bypass.py pending the Shipper commit. — Scope is the process-success metric + scorecard scores; RSI collector/engine/trend dashboard split to a follow-on. (orion)
- [status-router] executing -> reviewing | event=bypass_reviewing | bypass transition (or-bx-01)
- Orion remediated: INI-999900469 bypass closed — CAROL-INI-696 close-marker: the Orion bypass INI-999900469 filed against this parent reached terminal state (closed). This row's literal prefix Orion remediated: is the canonical signal the cookbook-155 dispatcher gate looks for. (shared.bypass.bypass_end)
- Elrond re-scoped success criterion 1 (replace) on Albus's prescription — Policy P.01.02.04.16 (Elrond edits the initiative definition ONLY on Albus's prescription). Albus diagnosis: The original 'target 100%' criterion is fundamentally unreachable on a live system where some runs naturally fail. The replacement criterion bounds the goal to a realistic threshold (90%) and adds a trend-vs-prior-week check, which the 10-week RSI already records. This eliminates the infinite loop: a service at 87% with an improving trend passes instead of cycling back for yet another 'fix' run. (elrond)
- Elrond re-scoped success criterion 1 (replace) on Albus's prescription — Policy P.01.02.04.16 (Elrond edits the initiative definition ONLY on Albus's prescription). Albus diagnosis: The original 'target 100%' criterion is fundamentally unreachable on a live system where some runs naturally fail. The replacement criterion bounds the goal to a realistic threshold (90%) and adds a trend-vs-prior-week check, which the 10-week RSI already records. This eliminates the infinite loop: a service at 87% with an improving trend passes instead of cycling back for yet another 'fix' run. (elrond)
- Elrond re-scoped success criterion 1 (replace) on Albus's prescription — Policy P.01.02.04.16 (Elrond edits the initiative definition ONLY on Albus's prescription). Albus diagnosis: Original criterion referenced 'run-audit rows mapped via shared resolver' which is a broken architectural dependency that doesn't exist. The criterion must reference actual database tables that exist in the system. (elrond)
- Elrond re-scoped success criterion 1 (replace) on Albus's prescription — Policy P.01.02.04.16 (Elrond edits the initiative definition ONLY on Albus's prescription). Albus diagnosis: Original criterion referenced 'run-audit rows mapped via shared resolver' which is a broken architectural dependency that doesn't exist. The criterion must reference actual database tables that exist in the system. (elrond)
- [status-router] reviewing -> blocked | event=operator_put | PUT /api/initiatives (operator)
- Orion remediated: Albus RSI group diagnosis (via INI 999900490): [procedural, confidence high] The initiative is blocked because the operator manually put it to blocked after a reviewer reopened it for rework, indicating that the execution artifacts did not satisfy the success criteria. The prior diagnosis for a similar initiative (1000107) highlighted that criteria were meta-checks rather than substantive, and although the current criteria appear improved, the pipeline still lacks proper evidence capture automation, causing the operator to intervene when progress stalls. (orion)
- [status-router] blocked -> closed | event=operator_put | PUT /api/initiatives (operator)
- [rsi-group-cure] Cured by the group diagnosis on INI 999900490 (shared cause operator_put); retriggered as INI 999900926. Root cause: [procedural, confidence high] The initiative is blocked because the operator manually put it to blocked after a reviewer reopened it for rework, indicating that the execution artifacts did not satisfy the success criteria. The prior diagnosis for a similar initiative (1000107) highlighted that criteria were meta-checks rather than substantive, and although the current criteria appear improved, the (elrond.rsi_loop)
✅Success criteria
- Hermione Process Health Scorecard shows a REAL current-week process success score per service (completed vs failed runs x10), target 100%, replacing the placeholder text; services with no run-audit show a dash. (must_have)
- The score uses the same weekly methodology as the Quality Scorecard (weekly window, current week, carry-forward, 10-week history). (must_have)
- An RSI scoreboard stores Hermione weekly process-success per service with 10-week history via a daily collector droid (run-audited). (must_have)
- An RSI improvement engine files an initiative for any service below 100% so Hermione can improve week over week (run-audited, scheduled). (must_have)
- Counting/attribution reuses the shared process->service resolver so the scorecard and Scheduled Processes app stay consistent. (must_have)