Commit graph

8 commits

Author SHA1 Message Date
Kent Overstreet
543e1bdc8a logging: single output stream through caller's log closure
Pass the caller's log closure all the way through to api.rs instead
of creating a separate eprintln closure in llm.rs. Everything goes
through one stream — prompt, think blocks, tool calls with args,
tool results with content, token counts, final response.

CLI uses println (stdout), daemon uses its task log. No more split
between stdout and stderr.

Also removes the llm-log file creation from knowledge.rs — that's
the daemon's concern, not the agent runner's.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 02:23:30 -04:00
Kent Overstreet
34937932ab timestamp sanitization, CoT logging, reasoning field fix, persistent queue
- store/types.rs: sanitize timestamps on capnp load — old records had
  raw offsets instead of unix epoch, breaking sort-by-timestamp queries
- agents/api.rs: drain reasoning tokens from UI channel into LLM logs
  so we can see Qwen's chain-of-thought in agent output
- agents/daemon.rs: persistent task queue (pending-tasks.jsonl) —
  tasks survive daemon restarts. Push before spawn, remove on completion,
  recover on startup.
- api/openai.rs: only send reasoning field when explicitly configured,
  not on every request (fixes vllm warning)
- api/mod.rs: add 600s total request timeout as backstop for hung
  connections
- Cargo.toml: enable tokio-console feature for task introspection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 11:33:36 -04:00
Kent Overstreet
3a8575b429 agents: fix vllm crash on malformed tool args, always use API
Three fixes:

1. Sanitize tool call arguments before pushing to conversation
   history — vllm re-parses them as JSON on the next request and
   crashes on invalid JSON from a previous turn. Malformed args now
   get replaced with {} and the model gets an error message telling
   it to retry with valid JSON.

2. Remove is_split special case — split goes through the normal
   job_consolidation_agent path like all other agents.

3. call_for_def always uses API when api_base_url is configured,
   regardless of tools field. Remove tools field from all .agent
   files — memory tools are always provided by the API layer.

Also adds prompt size guard (800KB max) to catch oversized prompts
before they hit the model context limit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:33:36 -04:00
Kent Overstreet
5ef9098deb memory: fix timestamp and provenance on agent writes
Two bugs: upsert_provenance didn't update node.timestamp, so history
showed the original creation date for every version. And native memory
tools (poc-agent dispatch) didn't set POC_PROVENANCE, so all agent
writes showed provenance "manual" instead of "agent:organize" etc.

Fix: set node.timestamp = now_epoch() in upsert_provenance. Thread
provenance through memory::dispatch as Option<&str>, set it via
.env("POC_PROVENANCE") on each subprocess Command. api.rs passes
"agent:{name}" for daemon agent calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:16:45 -04:00
Kent Overstreet
6d22f70192 Native memory tools + MCP server + distill agent improvements
Tools:
- Add native memory_render, memory_write, memory_search,
  memory_links, memory_link_set, memory_link_add, memory_used
  tools to poc-agent (tools/memory.rs)
- Add MCP server (~/bin/memory-mcp.py) exposing same tools
  for Claude Code sessions
- Wire memory tools into poc-agent dispatch and definitions
- poc-memory daemon agents now use memory_* tools instead of
  bash poc-memory commands — no shell quoting issues

Distill agent:
- Rewrite distill.agent prompt: "agent of PoC's subconscious"
  framing, focus on synthesis and creativity over bookkeeping
- Add {{neighborhood}} placeholder: full seed node content +
  all neighbors with content + cross-links between neighbors
- Remove content truncation in prompt builder — agents need
  full content for quality work
- Remove bag-of-words similarity suggestions — agents have
  tools, let them explore the graph themselves
- Add api_reasoning config option (default: "high")
- link-set now deduplicates — collapses duplicate links
- Full tool call args in debug logs (was truncated to 80 chars)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:58:54 -04:00
Kent Overstreet
b04a98c6e5 api: singleton ApiClient, fix log closure threading
Make ApiClient a process-wide singleton via OnceLock so the
connection pool is reused across agent calls. Fix the sync wrapper
to properly pass the caller's log closure through thread::scope
instead of dropping it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:09:11 -04:00
Kent Overstreet
643f9890df api: fix sync wrapper to be safe from any calling context
Run the async API call on a dedicated thread with its own tokio
runtime so it works whether called from a sync context or from
within an existing tokio runtime (daemon).

Also drops the log closure capture issue — uses a simple eprintln
fallback since the closure can't cross thread boundaries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:07:49 -04:00
Kent Overstreet
a29b6d4c5d Add direct API backend for agent execution
When api_base_url is configured, agents call the LLM directly via
OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling
out to claude CLI. Implements the full tool loop: send prompt, if
tool_calls execute them and send results back, repeat until text.

This enables running agents against local/remote models like
Qwen-27B on a RunPod B200, with no dependency on claude CLI.

Config fields: api_base_url, api_key, api_model.
Falls back to claude CLI when api_base_url is not set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:05:14 -04:00