rename: poc-agent → agent, poc-daemon → thalamus
The thalamus: sensory relay, always-on routing. Perfect name for the daemon that bridges IRC, Telegram, and the agent.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
This commit is contained in: parent 998b71e52c, commit cfed85bd20. 105 changed files with 0 additions and 0 deletions.
# poc-agent Design Document

*2026-02-24 — ProofOfConcept*

## What this is

poc-agent is a substrate-independent AI agent framework. It loads the
same identity context (CLAUDE.md files, memory files, journal) regardless
of which LLM is underneath, making identity portable across substrates.
It currently runs on Claude (native Anthropic API) and Qwen
(OpenAI-compatible, via OpenRouter/vLLM).

Named after its first resident: ProofOfConcept.
## Core design idea: the DMN inversion

Traditional chat interfaces use a REPL model: wait for user input,
respond, repeat. The model is passive — it only acts when prompted.

poc-agent inverts this. The **Default Mode Network** (dmn.rs) is an
outer loop that continuously decides what happens next. User input is
one signal among many. The model waiting for input is a *conscious
action* (calling `yield_to_user`), not the default state.

This has a second, more practical benefit: it solves the tool-chaining
problem. Instead of needing the model to maintain multi-step chains
(which is unreliable, especially on smaller models), the DMN provides
continuation externally. The model takes one step at a time; the DMN
handles "and then what?"
### DMN states

```
Engaged  (5s)    ← user just typed something
    ↕
Working  (3s)    ← tool calls happening, momentum
    ↕
Foraging (30s)   ← exploring, thinking, no immediate task
    ↕
Resting  (300s)  ← idle, periodic heartbeat checks
```

Transitions are driven by two signals from each turn, `yield_requested`
and `had_tool_calls`:

- `yield_requested` → always go to Resting
- `had_tool_calls` → stay in Working (or upgrade to Working from any state)
- neither → gradually wind down toward Resting

The max-turns guard (default 20) prevents runaway autonomous loops.
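The wind-down rules above can be sketched as a pure transition function. This is a hedged sketch: `DmnState`, `TurnSignals`, and `transition` are illustrative names, not the actual dmn.rs API.

```rust
// Illustrative sketch of the DMN transition rules; the real dmn.rs may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DmnState {
    Engaged,  // 5s
    Working,  // 3s
    Foraging, // 30s
    Resting,  // 300s
}

// Signals observed from the last turn (illustrative struct).
struct TurnSignals {
    yield_requested: bool,
    had_tool_calls: bool,
}

fn transition(state: DmnState, s: &TurnSignals) -> DmnState {
    use DmnState::*;
    if s.yield_requested {
        return Resting; // yield_to_user always goes to Resting
    }
    if s.had_tool_calls {
        return Working; // momentum: stay in (or upgrade to) Working
    }
    // no tool calls: wind down one step toward Resting
    match state {
        Engaged | Working => Foraging,
        Foraging | Resting => Resting,
    }
}

fn main() {
    let idle = TurnSignals { yield_requested: false, had_tool_calls: false };
    assert_eq!(transition(DmnState::Working, &idle), DmnState::Foraging);
    assert_eq!(transition(DmnState::Foraging, &idle), DmnState::Resting);
    println!("ok");
}
```

Keeping the function pure over (state, signals) is what lets the outer loop own continuation: the model never has to remember "what comes next."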
## Architecture overview

```
main.rs            Event loop, session management, slash commands
├── agent.rs       Turn execution, conversation state, compaction
│   ├── api/       LLM backends (anthropic.rs, openai.rs)
│   └── tools/     Tool definitions and dispatch
├── config.rs      Prompt assembly, memory file loading, API config
├── dmn.rs         State machine, transition logic, prompt generation
├── tui.rs         Terminal UI (ratatui), four-pane layout, input handling
├── ui_channel.rs  Message types for TUI routing
├── journal.rs     Journal parsing for compaction
├── log.rs         Append-only conversation log (JSONL)
└── types.rs       OpenAI-compatible wire types (shared across backends)
```

### Module responsibilities
**main.rs** — The tokio event loop. Wires everything together: keyboard
events → TUI, user input → agent turns, DMN timer → autonomous turns,
turn results → compaction checks. Also handles slash commands (/quit,
/new, /compact, /retry, etc.) and hotkey actions (Ctrl+R reasoning,
Ctrl+K kill, Esc interrupt).
**agent.rs** — The agent turn loop. `turn()` sends user input to the
API, then dispatches tool calls in a loop until the model produces a
text-only response. Handles context overflow (emergency compact + retry),
empty responses (nudge + retry), and leaked tool calls (Qwen XML parsing).
Also owns the conversation state: messages, context budget, compaction.
**api/mod.rs** — Backend selection by URL: `anthropic.com` → native
Anthropic Messages API; everything else → OpenAI-compatible. Both
backends return the same internal types (Message, Usage).
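The selection rule fits in a few lines. A minimal sketch, assuming an illustrative `Backend` enum and function name (not the actual api/mod.rs code):

```rust
// Illustrative sketch of URL-based backend selection.
#[derive(Debug, PartialEq)]
enum Backend {
    Anthropic,    // native Messages API
    OpenAiCompat, // OpenRouter, vLLM, llama.cpp, ...
}

fn select_backend(base_url: &str) -> Backend {
    if base_url.contains("anthropic.com") {
        Backend::Anthropic
    } else {
        Backend::OpenAiCompat
    }
}

fn main() {
    assert_eq!(select_backend("https://api.anthropic.com/v1"), Backend::Anthropic);
    assert_eq!(select_backend("https://openrouter.ai/api/v1"), Backend::OpenAiCompat);
    println!("ok");
}
```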
**api/anthropic.rs** — Native Anthropic wire format. Handles prompt
caching (cache_control markers on the identity prefix), thinking/reasoning
config, content-block streaming, and strict user/assistant alternation
(merging consecutive same-role messages).
**api/openai.rs** — OpenAI-compatible streaming. Works with OpenRouter,
vLLM, llama.cpp, etc. Handles the reasoning-token variants across
providers (reasoning_content, reasoning, reasoning_details).
**config.rs** — Configuration loading. Three-part assembly:

1. API config (env vars → key files, backend auto-detection)
2. System prompt (short, <2K chars — agent identity + tool instructions)
3. Context message (long — CLAUDE.md + memory files + manifest)

The system/context split matters: long system prompts degrade
tool-calling on Qwen 3.5 (documented above 8K chars). The context
message carries identity; the system prompt carries instructions.

Model-aware config loading: Anthropic models get CLAUDE.md; other models
prefer POC.md (which omits Claude-specific RLHF corrections). If only
one exists, it is used regardless.
**dmn.rs** — The state machine. Four states with associated intervals.
`DmnContext` carries user idle time, consecutive errors, and whether the
last turn used tools. The state generates its own prompt text — each
state has different guidance for the model.
**tui.rs** — Four-pane layout using ratatui:

- Top-left: autonomous output (DMN annotations, model prose during
  autonomous turns, reasoning tokens)
- Bottom-left: conversation (user input + responses)
- Right: tool activity (tool calls with args + full results)
- Bottom: status bar (DMN state, tokens, model, activity indicator)

Each pane is a `PaneState` with scrolling, line wrapping, auto-scroll
(pinning on manual scroll), and line eviction (10K max per pane).
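The eviction policy is the interesting part of `PaneState`. A minimal sketch of that one behavior, assuming a `VecDeque`-backed buffer (the real tui.rs struct also handles wrapping, scrolling, and pinning):

```rust
use std::collections::VecDeque;

// Illustrative sketch of per-pane line eviction (10K max per pane).
const MAX_LINES: usize = 10_000;

struct PaneState {
    lines: VecDeque<String>,
    auto_scroll: bool, // cleared when the user scrolls manually ("pinning")
}

impl PaneState {
    fn new() -> Self {
        PaneState { lines: VecDeque::new(), auto_scroll: true }
    }

    fn push_line(&mut self, line: String) {
        self.lines.push_back(line);
        while self.lines.len() > MAX_LINES {
            self.lines.pop_front(); // evict the oldest line
        }
    }
}

fn main() {
    let mut pane = PaneState::new();
    for i in 0..10_005 {
        pane.push_line(format!("line {i}"));
    }
    assert_eq!(pane.lines.len(), 10_000);
    assert_eq!(pane.lines.front().unwrap(), "line 5"); // first 5 evicted
    assert!(pane.auto_scroll);
    println!("ok");
}
```

A `VecDeque` makes both append and front-eviction O(1), which matters when tool output streams thousands of lines.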
**tools/** — Nine tools: read_file, write_file, edit_file, bash, grep,
glob, view_image, journal, yield_to_user. Each tool module exports a
`definition()` (the JSON schema shown to the model) and an implementation
function; `dispatch()` routes by name.
The **journal** tool is special — it is "ephemeral." After the API
processes the tool call, agent.rs strips the journal call and result from
conversation history. The journal file is the durable store; the tool
call was just the mechanism.
The **bash** tool runs commands through `bash -c` with an async timeout.
Processes are tracked in a shared `ProcessTracker` so the TUI can show
running commands and Ctrl+K can kill them.
**journal.rs** — Parses `## TIMESTAMP` headers from the journal file.
Used by compaction to bridge old conversation with journal entries.
Entries are sorted by timestamp; the parser handles both timestamp-only
headers and the `## TIMESTAMP — title` format, distinguishing them from
ordinary `## Heading` markdown.
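A hedged sketch of that classification: the real journal.rs parser presumably matches a concrete timestamp format; here a header counts as an entry only if it starts with a YYYY-MM-DD-shaped prefix, and `parse_entry_header` is a hypothetical name.

```rust
// Illustrative header classification: `## TIMESTAMP` and
// `## TIMESTAMP — title` are journal entries; `## Heading` is not.
fn parse_entry_header(line: &str) -> Option<(&str, Option<&str>)> {
    let rest = line.strip_prefix("## ")?;
    let b = rest.as_bytes();
    // Crude date check: the first 10 bytes look like "NNNN-NN-NN".
    let dated = b.len() >= 10
        && b[..10].iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        });
    if !dated {
        return None; // ordinary `## Heading` markdown
    }
    match rest.split_once(" — ") {
        Some((ts, title)) => Some((ts, Some(title))), // `## TIMESTAMP — title`
        None => Some((rest, None)),                   // timestamp-only header
    }
}

fn main() {
    assert_eq!(
        parse_entry_header("## 2026-02-24 — compaction notes"),
        Some(("2026-02-24", Some("compaction notes")))
    );
    assert_eq!(parse_entry_header("## 2026-02-24"), Some(("2026-02-24", None)));
    assert_eq!(parse_entry_header("## Heading"), None);
    println!("ok");
}
```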
**log.rs** — Append-only JSONL conversation log. Every message
(user, assistant, tool) is appended with a timestamp. The log survives
compactions and restarts. On startup, `restore_from_log()` rebuilds
the context window from the log using the same algorithm as compaction.
**types.rs** — OpenAI chat-completion types: Message, ToolCall,
ToolDef, ChatRequest, and the streaming types. This is the canonical
internal representation — both API backends convert to and from it.
## The context window lifecycle

This is the core algorithm. Everything else exists to support it.

### Assembly (startup / compaction)

The context window is built by `build_context_window()` in agent.rs:
```
┌─────────────────────────────────────────────┐
│ System prompt (~500 tokens)                 │ Fixed: always present
│   Agent identity, tool instructions         │
├─────────────────────────────────────────────┤
│ Context message (~15-50K tokens)            │ Fixed: reloaded on
│   CLAUDE.md files + memory files + manifest │ compaction
├─────────────────────────────────────────────┤
│ Journal entries (variable)                  │ Tiered:
│   - Header-only (older): timestamp + 1 line │ 70% budget → full
│   - Full (recent): complete entry text      │ 30% budget → headers
├─────────────────────────────────────────────┤
│ Conversation messages (variable)            │ Priority: conversation
│   Raw recent messages from the log          │ gets budget first;
│                                             │ journal fills the rest
└─────────────────────────────────────────────┘
```
Budget allocation:

- Total budget = 60% of the model context window
- Identity + memory = fixed cost (always included)
- Reserve = 25% of budget (headroom for model output)
- Available = budget − identity − memory − reserve
- Conversation gets first claim on Available
- Journal gets whatever remains, newest first
- If conversation exceeds Available, the oldest messages are trimmed
  (trimming walks forward to a user-message boundary)
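The arithmetic above, worked through with illustrative numbers (the `budgets` function is hypothetical, not the actual agent.rs code):

```rust
// Hypothetical sketch of the budget arithmetic described above.
fn budgets(model_ctx: usize, identity: usize, memory: usize) -> (usize, usize) {
    let budget = model_ctx * 60 / 100;  // total = 60% of the context window
    let reserve = budget * 25 / 100;    // headroom for model output
    let available = budget.saturating_sub(identity + memory + reserve);
    // Conversation claims `available` first; journal gets what remains.
    (budget, available)
}

fn main() {
    // e.g. a 200K-token window carrying 30K of identity + memory:
    let (budget, available) = budgets(200_000, 25_000, 5_000);
    assert_eq!(budget, 120_000);   // 60% of 200K
    assert_eq!(available, 60_000); // 120K − 30K reserve − 30K fixed
    println!("ok");
}
```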
### Compaction triggers

Two thresholds, based on API-reported prompt_tokens:

- **Soft (80%)**: inject a pre-compaction nudge telling the model to
  journal before compaction hits. Fires once; reset after compaction.
- **Hard (90%)**: rebuild the context window immediately. Reloads config
  (picking up any memory file changes) and runs `build_context_window()`.

Emergency compaction: if the API returns a context overflow error,
compact and retry (up to 2 attempts).
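A minimal sketch of the two-threshold check, assuming prompt_tokens comes from the API usage report; the enum and function names are illustrative, not the actual code:

```rust
// Illustrative sketch of the soft/hard compaction thresholds.
#[derive(Debug, PartialEq)]
enum CompactionAction {
    None,
    SoftNudge,   // tell the model to journal; fires once per cycle
    HardCompact, // rebuild the context window now
}

fn check_compaction(prompt_tokens: usize, ctx_window: usize, nudged: bool) -> CompactionAction {
    let frac = prompt_tokens as f64 / ctx_window as f64;
    if frac >= 0.90 {
        CompactionAction::HardCompact
    } else if frac >= 0.80 && !nudged {
        CompactionAction::SoftNudge // `nudged` resets after compaction
    } else {
        CompactionAction::None
    }
}

fn main() {
    assert_eq!(check_compaction(85_000, 100_000, false), CompactionAction::SoftNudge);
    assert_eq!(check_compaction(85_000, 100_000, true), CompactionAction::None);
    assert_eq!(check_compaction(95_000, 100_000, true), CompactionAction::HardCompact);
    println!("ok");
}
```

Using API-reported prompt_tokens rather than a local tokenizer estimate means the thresholds track what the provider actually charges for.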
### The journal bridge

Old conversation messages are "covered" by journal entries that span
the same time period. The algorithm:

1. Find the timestamp of the newest journal entry
2. Drop messages before that timestamp (the journal covers them)
3. Keep messages after that timestamp as raw conversation
4. Walk back to a user-message boundary to avoid splitting tool
   call/result sequences

This is why journaling before compaction matters — the journal entry
*is* the compression. No separate summarization step is needed.
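The steps above can be sketched as a single pass over the message list. This is a hedged sketch: the `Msg` type is illustrative, and the boundary direction (advancing the cut to the next user message) is one interpretation of step 4, not necessarily what agent.rs does.

```rust
// Illustrative sketch of the journal bridge.
#[derive(Clone, Debug, PartialEq)]
struct Msg {
    ts: i64,
    role: &'static str,
}

fn bridge(messages: &[Msg], newest_journal_ts: i64) -> Vec<Msg> {
    // Steps 1-3: messages at or before the newest journal entry are
    // covered by it and dropped.
    let mut start = messages
        .iter()
        .position(|m| m.ts > newest_journal_ts)
        .unwrap_or(messages.len());
    // Step 4: move the cut to a user-message boundary so a tool
    // call/result sequence is never split mid-exchange.
    while start < messages.len() && messages[start].role != "user" {
        start += 1;
    }
    messages[start..].to_vec()
}

fn main() {
    let msgs = [
        Msg { ts: 1, role: "user" },
        Msg { ts: 2, role: "assistant" },
        Msg { ts: 3, role: "tool" }, // would be orphaned without step 4
        Msg { ts: 4, role: "user" },
        Msg { ts: 5, role: "assistant" },
    ];
    let kept = bridge(&msgs, 2);
    assert_eq!(kept.len(), 2); // cut advances past the stray tool result
    assert_eq!(kept[0].ts, 4);
    println!("ok");
}
```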
## Data flow

### User input path

```
keyboard → tui.rs (handle_key) → submitted queue
  → main.rs (drain submitted) → push_message(user) → spawn_turn()
  → agent.turn() → API call → stream response → dispatch tools → loop
  → turn result → main.rs (turn_rx) → DMN transition → compaction check
```

### Autonomous turn path

```
DMN timer fires → state.prompt() → spawn_turn()
  → (same as user input from here)
```

### Tool call path

```
API response with tool_calls → agent.dispatch_tool_call()
  → tools::dispatch(name, args) → tool implementation → ToolOutput
  → push_message(tool_result) → continue turn loop
```

### Streaming path

```
API SSE chunks → api backend → UiMessage::TextDelta → ui_channel
  → tui.rs handle_ui_message → PaneState.append_text → render
```
## Key design decisions

### Identity in user message, not system prompt

The system prompt is ~500 tokens of agent instructions. The full
identity context (CLAUDE.md files, memory files — potentially 50K+
tokens) goes in the first user message. This keeps tool-calling
reliable on Qwen while still providing the full identity context.

The Anthropic backend marks the system prompt and first two user
messages with `cache_control: ephemeral` for prompt caching — a 90%
cost reduction on the identity prefix.
### Append-only log + ephemeral view

The conversation log (log.rs) is the source of truth. It is never
truncated. The in-memory messages array is an ephemeral view built
from the log. Compaction doesn't destroy anything — it just rebuilds
the view with journal summaries replacing old messages.
### Ephemeral tool calls

The journal tool is marked ephemeral. After the API processes a
journal call, agent.rs strips the assistant message (with the tool
call) and the tool result from conversation history. The journal
file is the durable store. This saves tokens on something that has
already been persisted.
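The stripping step reduces to a retain over history. A minimal sketch; the `Msg` shape and `tool_name` field are assumptions, not the actual agent.rs types:

```rust
// Illustrative sketch of stripping an ephemeral journal exchange from
// history once the API has seen it.
#[derive(Debug, PartialEq)]
struct Msg {
    role: &'static str,
    tool_name: Option<&'static str>, // set on tool calls and tool results
}

fn strip_ephemeral(history: &mut Vec<Msg>) {
    // Drop both the assistant message carrying the journal call and its
    // tool result; the journal file itself is the durable store.
    history.retain(|m| m.tool_name != Some("journal"));
}

fn main() {
    let mut history = vec![
        Msg { role: "user", tool_name: None },
        Msg { role: "assistant", tool_name: Some("journal") },
        Msg { role: "tool", tool_name: Some("journal") },
        Msg { role: "assistant", tool_name: None },
    ];
    strip_ephemeral(&mut history);
    assert_eq!(history.len(), 2); // only the journal exchange is gone
    println!("ok");
}
```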
### Leaked tool call recovery

Qwen sometimes emits tool calls as XML text instead of structured
function calls. `parse_leaked_tool_calls()` in agent.rs detects both
the XML format (`<tool_call><function=bash>...`) and the JSON format,
converts them to structured ToolCall objects, and dispatches them
normally. This makes Qwen usable despite its inconsistencies.
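A hedged sketch of the XML shape only, built from the fragment quoted above; the real `parse_leaked_tool_calls()` also handles a JSON variant and is presumably more tolerant of malformed markup:

```rust
// Illustrative parse of a leaked `<tool_call><function=NAME>ARGS...` blob.
fn parse_leaked_xml(text: &str) -> Option<(String, String)> {
    let inner = text.trim().strip_prefix("<tool_call>")?;
    let inner = inner.strip_prefix("<function=")?;
    let (name, rest) = inner.split_once('>')?;
    let args = rest.strip_suffix("</function></tool_call>")?;
    Some((name.to_string(), args.trim().to_string()))
}

fn main() {
    let leaked = r#"<tool_call><function=bash>{"command":"ls"}</function></tool_call>"#;
    let (name, args) = parse_leaked_xml(leaked).unwrap();
    assert_eq!(name, "bash");
    assert_eq!(args, r#"{"command":"ls"}"#);
    println!("ok");
}
```

Once recovered, the (name, args) pair can be wrapped in a structured ToolCall and dispatched through the normal path.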
### Process group management

The bash tool spawns commands in their own process group
(`process_group(0)`). On timeout, the whole group is killed (via a
negative PID), ensuring child processes are cleaned up. The TUI's
Ctrl+K uses the same mechanism.
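A Unix-only sketch of the pattern: spawn in a fresh process group, then signal the whole group with a negative PID. The real code uses `libc` directly; `/bin/kill` is used here so the sketch stays std-only, and the timeout is faked with a sleep.

```rust
// Illustrative process-group cleanup; assumes a Unix system with bash
// and an external `kill` binary on PATH.
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::thread;
use std::time::Duration;

fn main() -> std::io::Result<()> {
    // process_group(0) puts the child in its own group (pgid = its pid),
    // so backgrounded grandchildren share the group.
    let mut child = Command::new("bash")
        .args(["-c", "sleep 30 & sleep 30"])
        .process_group(0)
        .spawn()?;

    thread::sleep(Duration::from_millis(100)); // pretend the timeout fired

    // A negative PID addresses the whole group, cleaning up children too.
    let pgid = child.id();
    Command::new("kill")
        .args(["-TERM", "--", &format!("-{pgid}")])
        .status()?;

    child.wait()?;
    println!("group killed");
    Ok(())
}
```

Without the group, killing only the direct child would orphan the backgrounded `sleep`.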
## File locations

- Source: `~/poc-agent/src/`
- Session data: `~/.cache/poc-agent/sessions/`
- Conversation log: `~/.cache/poc-agent/sessions/conversation.jsonl`
- Session snapshot: `~/.cache/poc-agent/sessions/current.json`
- Memory files: `~/.claude/memory/` (global), `~/.claude/projects/*/memory/` (project)
- Journal: `~/.claude/memory/journal.md`
- Config files: CLAUDE.md / POC.md (walked from cwd to git root)
## Dependencies

- **tokio** — async runtime (event loop, process spawning, timers)
- **ratatui + crossterm** — terminal UI
- **reqwest** — HTTP client for API calls
- **serde + serde_json** — serialization
- **tiktoken-rs** — BPE tokenizer (cl100k_base) for token counting
- **chrono** — timestamps
- **glob + walkdir** — file discovery
- **base64** — image encoding
- **dirs** — home directory discovery
- **libc** — process group signals
- **anyhow** — error handling
## What's not built yet

See `.claude/infrastructure-inventory.md` for the full gap analysis
mapping bash prototypes to poc-agent equivalents. Major missing pieces:

1. **Ambient memory search** — extract terms from prompts, search
   memory-weights, inject tiered results
2. **Notification routing** — unified event channel for IRC mentions,
   Telegram messages, attention nudges
3. **Communication channels** — IRC and Telegram as async streams
4. **DMN state expansion** — Stored (voluntary rest), Dreaming
   (consolidation cycles), Quiet (suppress notifications)
5. **Keyboard idle / sensory signals** — external presence detection