Carolopedia

A friendly guide to Carol, her ecosystem, and the agents who built her.

Claude Tester

Droid · Multi-shot test agent using Claude with tools

📖 About & Usage

Owner agent — accountability this droid serves

The owner agent is responsible for quality engineering: ensuring that tests are written, verification is run, and defects are caught. The CT-S1 droid covers the verification slice: it automatically tests checklist items by investigating the live system to confirm that they work as required.

Droid responsibility

CT-S1 helps the agent verify changes by running automated multi-shot investigations on each checklist item. For items with scripted tests, it runs them. For items without existing tests, it uses Claude with tools (databases, APIs, shell commands, file reads) to gather evidence and reach a pass/fail verdict.

What the droid actually does

Load the checklist template for the change type and identify all items to verify
Run scripted tests where they exist, parse the output, and record results
For items with no scripted tests, prompt Claude to investigate the live system using available tools
Collect evidence (database queries, API responses, file contents, command output) and produce a pass/fail verdict with supporting details
Return structured results including status, summary, and full logs
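The steps above can be sketched roughly as follows. This is a minimal illustration, not the droid's actual code: `ItemResult`, `verify_checklist`, the `scripted_tests` mapping, and the `investigate` callback are all hypothetical names, and the Claude-with-tools investigation is stubbed out as a plain function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical shapes: a scripted test returns (passed, raw_output);
# an investigator returns (passed, evidence_lines) for one checklist item.
ScriptedTest = Callable[[], Tuple[bool, str]]
Investigator = Callable[[str], Tuple[bool, List[str]]]


@dataclass
class ItemResult:
    """Structured result for one checklist item: status, summary, full logs."""
    item_id: str
    status: str  # "pass" or "fail"
    summary: str
    logs: List[str] = field(default_factory=list)


def verify_checklist(
    items: Iterable[str],
    scripted_tests: Dict[str, ScriptedTest],
    investigate: Investigator,
) -> List[ItemResult]:
    """Run a scripted test where one exists, otherwise fall back to investigation."""
    results: List[ItemResult] = []
    for item_id in items:
        test = scripted_tests.get(item_id)
        if test is not None:
            passed, output = test()
            summary, logs = "scripted test", [output]
        else:
            passed, evidence = investigate(item_id)
            summary, logs = "investigated with tools", list(evidence)
        results.append(
            ItemResult(item_id, "pass" if passed else "fail", summary, logs)
        )
    return results


# Usage with stand-in callables (no live system is touched here).
results = verify_checklist(
    items=["item-a", "item-b"],
    scripted_tests={"item-a": lambda: (True, "1 passed")},
    investigate=lambda item_id: (False, ["query returned 0 rows"]),
)
```

In this sketch every item ends up with exactly one verdict, and the evidence that led to it travels with the result rather than being logged separately.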

Boundaries

Cannot write new tests, fix code, or modify the system — investigation only
Database access is read-only
Works only with checklist items defined in the existing mapping
Limited to 20 tool calls and 5 Claude reasoning rounds per run
Individual test scripts time out after 120 seconds; each Claude call times out after 5 minutes
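One way to picture the run limits listed above is a small budget tracker plus a script runner with a timeout. This is a sketch under assumptions: the constant names, `RunBudget`, `BudgetExceeded`, and `run_script` are hypothetical, and how the droid actually reacts when a limit is hit (raising, or returning a fail verdict) is not stated on this page.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Tuple

# Limits taken from the boundaries above; the names are illustrative.
MAX_TOOL_CALLS = 20
MAX_REASONING_ROUNDS = 5
SCRIPT_TIMEOUT_S = 120
CLAUDE_CALL_TIMEOUT_S = 5 * 60


class BudgetExceeded(RuntimeError):
    """Raised when a run tries to go past one of its limits (assumed behavior)."""


@dataclass
class RunBudget:
    tool_calls: int = 0
    rounds: int = 0

    def use_tool_call(self) -> None:
        # Refuse the call before counting it, so the counter never exceeds the cap.
        if self.tool_calls >= MAX_TOOL_CALLS:
            raise BudgetExceeded(f"tool call limit of {MAX_TOOL_CALLS} reached")
        self.tool_calls += 1

    def start_round(self) -> None:
        if self.rounds >= MAX_REASONING_ROUNDS:
            raise BudgetExceeded(f"round limit of {MAX_REASONING_ROUNDS} reached")
        self.rounds += 1


def run_script(cmd: List[str], timeout_s: float = SCRIPT_TIMEOUT_S) -> Tuple[bool, str]:
    """Run one test script; a timeout is reported as a failure with a log line."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} seconds"
    return proc.returncode == 0, proc.stdout + proc.stderr
```

A caller would create one `RunBudget` per run and call `use_tool_call()` before each tool invocation and `start_round()` before each reasoning round.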

👤 Owner

Argus · Tester

📚 Recent initiatives

Initiatives that touched this droid — a short summary each; open one for the full story.

CAROL-INI-0028-00: Worker droid status-string normalization (fix dr-s1 / en-ar-01 0% success)

GAP: po_s1.py treats only status in (success, pass) as worker success. dr-s1 and en-ar-01 return status=ok. Every Archon design run and every Albus remediation run is incorrectly…

Orion · 2026-05-14 17:13
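The gap described in this initiative can be illustrated with a short sketch. It is not the actual po_s1.py code: `SUCCESS_STATUSES` and `is_worker_success` are hypothetical names, and the summary does not say whether the fix widened the accepted set (as shown here) or changed what the workers return.

```python
# Hypothetical normalization: treat "ok" as success alongside "success" and "pass".
SUCCESS_STATUSES = frozenset({"success", "pass", "ok"})


def is_worker_success(status: object) -> bool:
    """Return True when a worker's status string counts as success."""
    if not isinstance(status, str):
        return False
    # Case and surrounding whitespace are ignored (an assumption of this sketch).
    return status.strip().lower() in SUCCESS_STATUSES
```

With a check limited to (success, pass), a worker that returns status=ok would be counted as failed on every run, which matches the 0% success rate named in the initiative title.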
