Carol — back to Apps ← Apps

Carolopedia

A friendly guide to Carol, her ecosystem, and the agents who built her.

📖 CarolopediaServicesBuild InitiativesAll activitiesINI-1000506
📋

CAROL-INI-0555-00: Fix call_claude_raw/call_claude: detect transient Not-logged-in CLI error and retry after delay

Initiative
Open in Initiatives →

📖About

ROOT CAUSE: shared/claude.py call_claude_raw (line 335) and call_claude (line 233) do not detect the Claude CLI transient auth failure "Not logged in · Please run /login". When a session token expires, the CLI returns this 33-char string instantly (zero API latency). call_claude_raw returns it as normal text. The developer droid (ex_dev_01) sees no executable code and burns 3 retry attempts in <1 second, all hitting the same instant error. PO-S1 marks the pipeline FAILED, exec goes to failed, dispatch_queue row goes to needs_attention, and stays there permanently.

EVIDENCE: 10 consecutive failed executions (IDs 1613-1639) between 14:09 and 16:40 on 2026-05-18, all with the same 33-char response "Not logged in · Please run /login". All 3 retry attempts per exec complete in <1s total. 8+ initiatives affected. Pipeline has been completely non-functional for code generation for over 2 hours.

FIX: In call_claude_raw, after reading stdout (line 416), check if the output matches the "Not logged in" pattern. If so, sleep 30s (auth tokens auto-refresh) and retry the CLI call once. Return the retry result. Same pattern in call_claude. This is analogous to the existing rate-limit detection at line 442 — same defensive pattern, different transient error class.

⚖️Decisions

  • Handover-watchdog: gap-H dispatched planned bypass INI 1000506. (elrond)