Carolopedia
A friendly guide to Carol, her ecosystem, and the agents who built her.
📖About
Carol's web chat (carol-chat app) currently accepts only text messages. Enable users to upload documents — images (jpg/png/gif/webp), PDFs, and text-like docs (txt/md/docx) — so Carol can answer questions about them, summarise, or store them for later reference.
DECISIONS:
- Single file picker in chat input area (paperclip icon next to send), multi-file accept.
- Per-user storage under users/ / /uploads/ / _ ; mirrored to linked WhatsApp folder if linked.
- 25 MB per file (Claude API attachment limit); validate MIME by extension AND magic bytes.
- Carol's brain receives the file in the message: images inlined as base64 image blocks, text/PDF extracted to text block. Single-file path first; multi-file path can follow.
- Stranger mode: uploads work but are NOT mirrored to admin's WhatsApp folder.
- Conversation persistence: a new attachments table on chat.db rows linked to the message id; index.html renders thumbnail for images, filename chip for docs.
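The "extension AND magic bytes" decision can be illustrated with a minimal sketch. The names `validate_upload`, `SIGNATURES`, and `TEXT_EXTENSIONS` are hypothetical and not taken from carol-chat. The sketch also assumes that txt/md files must decode as UTF-8, since plain text has no magic bytes to check.

```python
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # 25 MB per-file cap from the decisions above


def _starts(*prefixes):
    """Build a predicate that checks whether the file head starts with any prefix."""
    return lambda head: any(head.startswith(p) for p in prefixes)


# Hypothetical table: extension -> (MIME type, predicate over the leading bytes).
SIGNATURES = {
    ".jpg": ("image/jpeg", _starts(b"\xff\xd8\xff")),
    ".jpeg": ("image/jpeg", _starts(b"\xff\xd8\xff")),
    ".png": ("image/png", _starts(b"\x89PNG\r\n\x1a\n")),
    ".gif": ("image/gif", _starts(b"GIF87a", b"GIF89a")),
    ".webp": ("image/webp", lambda h: h[:4] == b"RIFF" and h[8:12] == b"WEBP"),
    ".pdf": ("application/pdf", _starts(b"%PDF-")),
    # docx is a ZIP container, so the signature only proves "this is a ZIP".
    ".docx": (
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        _starts(b"PK\x03\x04"),
    ),
}

# Assumption: text-like files are accepted when they are valid UTF-8 without NUL bytes.
TEXT_EXTENSIONS = {".txt": "text/plain", ".md": "text/markdown"}


def validate_upload(filename: str, data: bytes) -> str:
    """Return the MIME type for an accepted upload, or raise ValueError."""
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    ext = Path(filename).suffix.lower()
    if ext in TEXT_EXTENSIONS:
        if b"\x00" in data:
            raise ValueError("text file contains NUL bytes")
        try:
            data.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError("text file is not valid UTF-8") from exc
        return TEXT_EXTENSIONS[ext]
    if ext not in SIGNATURES:
        raise ValueError(f"unsupported extension: {ext!r}")
    mime, matches = SIGNATURES[ext]
    if not matches(data[:16]):
        raise ValueError("file content does not match its extension")
    return mime
```

Checking both the extension and the leading bytes means a PDF renamed to `.png` is rejected rather than being inlined as an image block.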
REQUIREMENTS:
- POST endpoint accepts multipart/form-data with up to 5 files per message.
- Files validated, stored, recorded in chat.db, mirrored to linked WhatsApp folder under same conversation key.
- Frontend shows pending upload chips before send; send button uses multipart when files present.
- Carol's brain receives attachments in its turn input and references them naturally in her reply.
- Existing pure-text path unchanged (zero regression on the smoke test).
SUCCESS CRITERIA:
- Upload + send an image → Carol describes it.
- Upload + send a PDF → Carol summarises or answers a question about its contents.
- Stranger mode upload → file lands in stranger namespace, never in admin's linked folder.
- Existing carol-chat tests still pass.
STRATEGY:
- Backend first: new POST /api/conversations/{id}/messages-with-files endpoint OR augment existing endpoint to accept multipart; storage helper; chat.db attachments table; pass-through to Carol's brain via a new extra_context['attachments'] payload.
- Brain hook: prompt builder / agent_loop reads attachments from extra_context and emits API-shaped image / document blocks.
- Frontend: paperclip icon, file input, pending-chip UI, multipart send.
- Tests: pytest for endpoint + brain plumbing; manual browser test for full flow.
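The brain hook in the strategy above could look roughly like the following sketch. The attachment dictionary shape (`mime`, `data`, `filename`, `text`) and the function names are assumptions for illustration; only the base64 image block layout follows the Claude Messages API. Text and PDF attachments are assumed to have been extracted to text upstream, as the decisions describe.

```python
import base64


def attachments_to_blocks(attachments):
    """Turn stored attachments into API-shaped content blocks.

    Each attachment is a hypothetical dict: images carry raw bytes under
    "data"; text-like docs and PDFs carry already-extracted text under "text".
    """
    blocks = []
    for att in attachments:
        if att["mime"].startswith("image/"):
            blocks.append(
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": att["mime"],
                        "data": base64.b64encode(att["data"]).decode("ascii"),
                    },
                }
            )
        else:
            # Label the extracted text so the reply can refer to the file by name.
            blocks.append(
                {
                    "type": "text",
                    "text": f"[Attachment: {att['filename']}]\n{att['text']}",
                }
            )
    return blocks


def build_user_content(message_text, extra_context):
    """Place attachment blocks before the user's text in the turn input."""
    blocks = attachments_to_blocks(extra_context.get("attachments", []))
    blocks.append({"type": "text", "text": message_text})
    return blocks
```

With no `attachments` key in `extra_context`, `build_user_content` yields a single text block, which is one way to keep the text-only path behaving as before. Whether the existing prompt builder passes a plain string instead is not stated in the source.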
PLAN STEPS:
1. Read carol-chat app.py + index.html + Carol's prompt builder + agent_loop to map current send path and find the hook points.
2. Add attachments table to chat.db + schema migration.
3. Implement upload endpoint + storage helper + MIME validation.
4. Wire attachments through to Carol's brain via extra_context.
5. Frontend paperclip + pending chips + multipart send.
6. Tests: endpoint test, brain-plumbing test, regression-baseline test for text-only path.
7. Manual browser test in stranger mode + admin mode.
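Step 2 of the plan, the attachments table, might be sketched as below using the standard-library `sqlite3` module. The column names, the `messages(id)` reference, and the integer message id are all assumptions, since the source does not describe the existing chat.db schema.

```python
import sqlite3

# Hypothetical schema: one row per stored file, linked to the message id.
SCHEMA = """
CREATE TABLE IF NOT EXISTS attachments (
    id          INTEGER PRIMARY KEY,
    message_id  INTEGER NOT NULL REFERENCES messages(id),
    filename    TEXT    NOT NULL,
    mime        TEXT    NOT NULL,
    size_bytes  INTEGER NOT NULL,
    stored_path TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_attachments_message
    ON attachments(message_id);
"""


def migrate(conn: sqlite3.Connection) -> None:
    """Idempotent migration: safe to run on every startup."""
    conn.executescript(SCHEMA)


def record_attachment(conn, message_id, filename, mime, size_bytes, stored_path):
    """Insert one attachment row and return its row id."""
    cur = conn.execute(
        "INSERT INTO attachments (message_id, filename, mime, size_bytes, stored_path) "
        "VALUES (?, ?, ?, ?, ?)",
        (message_id, filename, mime, size_bytes, stored_path),
    )
    conn.commit()
    return cur.lastrowid


def attachments_for_message(conn, message_id):
    """Return (filename, mime) pairs so the UI can pick thumbnail vs filename chip."""
    return conn.execute(
        "SELECT filename, mime FROM attachments WHERE message_id = ? ORDER BY id",
        (message_id,),
    ).fetchall()
```

Using `CREATE TABLE IF NOT EXISTS` keeps the migration repeatable, so existing text-only conversations are untouched when the table is first added.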
BUDGET: $5.00 (multi-file feature touching UI + backend + brain + tests).
⚖️Decisions
- Parked by Albus pending bypass INI 1000394. Once that closes, re-trigger this as a follow-on of itself to inherit the original scope and resume. (albus)
- Escalated by autonomous loop: bypass INI 1000394 is blocked; parked INI 1000393 cannot resume. Operator triage needed. (albus)
- Requester rewritten ninad -> orion per CAROL-INI-744: orion is the only human-CLI requester. This is a backfill of historical rows after INI744 added API-level refusal of requester=ninad. Orion is Ninad's CLI agent; all human-originated initiatives are filed with requester=orion. (orion)