Carolopedia

A friendly guide to Carol, her ecosystem, and the agents who built her.

CAROL-INI-0459-00: Enable document uploads in Carol Chat web interface

Initiative

📖 About

Carol's web chat (carol-chat app) currently accepts only text messages. Enable users to upload documents — images (jpg/png/gif/webp), PDFs, and text-like docs (txt/md/docx) — so Carol can answer questions about them, summarise, or store them for later reference.

DECISIONS:

  • Single file picker in chat input area (paperclip icon next to send), multi-file accept.
  • Per-user storage under users///uploads//_; mirrored to linked WhatsApp folder if linked.
  • 25 MB per file (Claude API attachment limit); validate MIME by extension AND magic bytes.
  • Carol's brain receives the file in the message: images inlined as base64 image blocks, text/PDF extracted to text block. Single-file path first; multi-file path can follow.
  • Stranger mode: uploads work but are NOT mirrored to admin's WhatsApp folder.
  • Conversation persistence: a new attachments table in chat.db, with rows linked to the message id; index.html renders a thumbnail for images and a filename chip for docs.
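
The validation decision above (25 MB cap, MIME checked by extension AND magic bytes) could be sketched roughly as follows. This is a minimal illustration, not the carol-chat implementation: the names `validate_upload`, `sniff_mime`, and `EXT_TO_MIME` are hypothetical, and treating txt/md as "valid UTF-8 with no known binary signature" is an assumption, since plain text has no magic bytes.

```python
from pathlib import Path
from typing import Optional

MAX_BYTES = 25 * 1024 * 1024  # 25 MB per file, per the decision above

# Hypothetical extension -> MIME map for the formats named in the initiative.
EXT_TO_MIME = {
    ".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png",
    ".gif": "image/gif", ".webp": "image/webp", ".pdf": "application/pdf",
    ".txt": "text/plain", ".md": "text/markdown",
    ".docx": "application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Guess a MIME type from leading magic bytes; None if unrecognised."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    if data.startswith(b"PK\x03\x04"):
        return "application/zip"  # docx files are zip containers
    return None


def validate_upload(filename: str, data: bytes) -> str:
    """Return the accepted MIME type, or raise ValueError."""
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    ext = Path(filename).suffix.lower()
    expected = EXT_TO_MIME.get(ext)
    if expected is None:
        raise ValueError(f"unsupported extension: {ext!r}")
    sniffed = sniff_mime(data)
    if expected.startswith("text/"):
        # Assumption: text must carry no binary signature and decode as UTF-8.
        if sniffed is not None:
            raise ValueError("binary content behind a text extension")
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            raise ValueError("text file is not valid UTF-8")
        return expected
    wanted = "application/zip" if ext == ".docx" else expected
    if sniffed != wanted:
        raise ValueError("file content does not match its extension")
    return expected
```

A PDF renamed to `photo.jpg` fails the magic-byte comparison and is rejected, which is the case that extension-only checks would miss.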

REQUIREMENTS:

  • POST endpoint accepts multipart/form-data with up to 5 files per message.
  • Files validated, stored, recorded in chat.db, mirrored to linked WhatsApp folder under same conversation key.
  • Frontend shows pending upload chips before send; send button uses multipart when files present.
  • Carol's brain receives attachments in its turn input and references them naturally in her reply.
  • Existing pure-text path unchanged (zero regression on the smoke test).
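
One possible shape for the attachments table and the 5-files-per-message limit is sketched below using Python's standard `sqlite3` module. The column names, the `migrate` / `record_attachments` / `attachments_for` helpers, and the absence of a foreign key are all assumptions, since the existing chat.db schema is not described here.

```python
import sqlite3

MAX_FILES_PER_MESSAGE = 5  # from the requirement above

# Hypothetical schema: one row per uploaded file, linked to a message id.
SCHEMA = """
CREATE TABLE IF NOT EXISTS attachments (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    message_id  INTEGER NOT NULL,
    filename    TEXT    NOT NULL,
    mime_type   TEXT    NOT NULL,
    size_bytes  INTEGER NOT NULL,
    stored_path TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_attachments_message
    ON attachments(message_id);
"""


def migrate(conn: sqlite3.Connection) -> None:
    """Idempotent migration: safe to run on every startup."""
    conn.executescript(SCHEMA)


def record_attachments(conn, message_id, files):
    """files: list of (filename, mime_type, size_bytes, stored_path)."""
    if len(files) > MAX_FILES_PER_MESSAGE:
        raise ValueError("at most 5 files per message")
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO attachments "
            "(message_id, filename, mime_type, size_bytes, stored_path) "
            "VALUES (?, ?, ?, ?, ?)",
            [(message_id, *f) for f in files],
        )


def attachments_for(conn, message_id):
    """Return (filename, mime_type) rows for rendering thumbnails or chips."""
    return conn.execute(
        "SELECT filename, mime_type FROM attachments "
        "WHERE message_id = ? ORDER BY id",
        (message_id,),
    ).fetchall()
```

Because the migration only uses `CREATE ... IF NOT EXISTS`, a text-only conversation never touches the new table, which fits the zero-regression requirement.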

SUCCESS CRITERIA:

  • Upload + send an image → Carol describes it.
  • Upload + send a PDF → Carol summarises or answers a question about its contents.
  • Stranger mode upload → file lands in stranger namespace, never in admin's linked folder.
  • Existing carol-chat tests still pass.
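
The stranger-mode criterion can be pinned down as a small pure function that is easy to test in isolation. The sketch below is illustrative only: `Session`, its fields, and `mirror_target` are hypothetical names, and it deliberately returns an opaque folder reference rather than building storage paths.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    """Hypothetical view of the current chat session."""
    user_id: str
    is_stranger: bool
    linked_whatsapp_folder: Optional[str]  # None when nothing is linked


def mirror_target(session: Session) -> Optional[str]:
    """Return the folder to mirror an upload into, or None for no mirroring."""
    if session.is_stranger:
        # Stranger uploads are stored but never mirrored.
        return None
    return session.linked_whatsapp_folder
```

Keeping the rule in one function means the upload endpoint has a single place to consult, and a pytest case can assert that no stranger session ever yields a mirror target.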

STRATEGY:

  • Backend first: new POST /api/conversations/{id}/messages-with-files endpoint OR augment existing endpoint to accept multipart; storage helper; chat.db attachments table; pass-through to Carol's brain via a new extra_context['attachments'] payload.
  • Brain hook: prompt builder / agent_loop reads attachments from extra_context and emits API-shaped image / document blocks.
  • Frontend: paperclip icon, file input, pending-chip UI, multipart send.
  • Tests: pytest for endpoint + brain plumbing; manual browser test for full flow.
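
For the brain hook, a rough sketch of turning `extra_context['attachments']` into message content blocks might look like this. The image block follows the base64 image content-block shape of the Anthropic Messages API; the attachment dict keys (`filename`, `mime_type`, `data`, `extracted_text`) and the function name `build_content_blocks` are assumptions, and text extraction from PDFs or docx is taken as already done upstream.

```python
import base64


def build_content_blocks(user_text: str, extra_context: dict) -> list:
    """Build the content list for one user turn.

    Images are inlined as base64 image blocks; other documents contribute
    their already-extracted text as a text block. With no attachments the
    result is a single text block, so the text-only path is unchanged.
    """
    blocks = []
    for att in extra_context.get("attachments", []):
        if att["mime_type"].startswith("image/"):
            blocks.append({
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": att["mime_type"],
                    "data": base64.b64encode(att["data"]).decode("ascii"),
                },
            })
        else:
            # Hypothetical key: text extracted from txt/md/docx/PDF upstream.
            blocks.append({
                "type": "text",
                "text": f"[Attachment: {att['filename']}]\n{att['extracted_text']}",
            })
    # The user's own message goes last, after the material it refers to.
    blocks.append({"type": "text", "text": user_text})
    return blocks
```

Placing attachments before the user's text is a design choice in this sketch, not something the initiative specifies.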

PLAN STEPS:

  1. Read carol-chat app.py + index.html + Carol's prompt builder + agent_loop to map current send path and find the hook points.
  2. Add attachments table to chat.db + schema migration.
  3. Implement upload endpoint + storage helper + MIME validation.
  4. Wire attachments through to Carol's brain via extra_context.
  5. Frontend paperclip + pending chips + multipart send.
  6. Tests: endpoint test, brain-plumbing test, regression-baseline test for text-only path.
  7. Manual browser test in stranger mode + admin mode.

BUDGET: $5.00 (multi-file feature touching UI + backend + brain + tests).

⚖️ Decisions

  • Parked by Albus pending bypass INI 1000394. Once that closes, re-trigger this as a follow-on of itself to inherit the original scope and resume. (albus)
  • Escalated by autonomous loop: bypass INI 1000394 is blocked; parked INI 1000393 cannot resume. Operator triage needed. (albus)
  • Requester rewritten ninad -> orion per CAROL-INI-744: orion is the only human-CLI requester. This is a backfill of historical rows after INI744 added API-level refusal of requester=ninad. Orion is Ninad's CLI agent; all human-originated initiatives are filed with requester=orion. (orion)