Commit graph

39 commits

Author SHA1 Message Date
Kent Overstreet
d451b69196 Fix XML tool call parsing: try JSON parse for parameter values
Parameter values like ["key1", "key2"] were being wrapped as strings
instead of parsed as JSON arrays. Tools expecting array arguments
(like memory_search) got a string containing the array literal.

Now tries serde_json::from_str first, falls back to String.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 18:52:10 -04:00
Kent Overstreet
785dea9b9b Update EBNF grammar comment for tool_result format
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 18:43:50 -04:00
Kent Overstreet
8e5747ff43 Fix tool result format: Qwen expects <tool_response> in user role
Qwen's chat template renders tool results as:
  <|im_start|>user\n<tool_response>\n{content}\n</tool_response><|im_end|>

We were rendering as:
  <|im_start|>tool\n{content}<|im_end|>

The model never saw <|im_start|>tool in training, so it ignored our
tool results and looped retrying the same call. Found by comparing
our tokenization against vLLM's /tokenize endpoint with chat messages.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 18:42:47 -04:00
Kent Overstreet
8bf6753949 Debug: add context token count to parser log, fix compact() tool defs
compact() was clearing tool definitions from the system section on
startup — now leaves system section untouched (set once by new()).
Added context token count to parser done log for diagnosing the
subconscious agent loop issue.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:57:10 -04:00
Kent Overstreet
fc75b181cf Fix: compact() was clearing tool definitions from system section
compact() cleared and rebuilt the system section but only pushed the
system prompt — tool definitions were lost. Since new() sets up the
system section correctly (prompt + tools), compact() now only reloads
identity and journal, leaving system untouched.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:48:10 -04:00
Kent Overstreet
d4d661df5b Parser debug logging to /tmp/poc-{agent_name}.log
Logs full response text when no tool calls detected, tool call
bodies when found. Per-agent log files for debugging subconscious
agent parsing issues.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:39:55 -04:00
Kent Overstreet
473909db47 Add parser debug logging (POC_DEBUG=1)
Logs full text length, <tool_call> tag count, and tool call details
on stream completion. Helps diagnose parsing issues with subconscious
agents.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:38:02 -04:00
Kent Overstreet
119dc8c146 Store trimmed text in Content and Thinking nodes
Was checking trim but storing untrimmed. Now stores the trimmed
version — no leading/trailing whitespace in the AST.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:25:47 -04:00
Kent Overstreet
01bbc39a31 Drop whitespace-only content nodes from parser output
Content between tags (e.g. newlines between </think> and <tool_call>)
was creating empty Content nodes. Now trimmed before creating the node.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:21:34 -04:00
Kent Overstreet
1b6664ee1c Fix: skip empty CoT nodes, expand AST children in conscious screen, timestamps
Parser skips Thinking nodes that are just whitespace. Conscious screen
now shows assistant children (Content, Thinking, ToolCall) as nested
tree items via recursive node_to_view. Nodes get timestamped in
push_node and on assistant branch creation.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:18:48 -04:00
Kent Overstreet
5ec2ff95d8 Fix parser: re-encode tokens instead of tracking model IDs through tag splits
The parser can't reliably split model-produced token IDs at tag
boundaries (<think>, <tool_call>) because BPE tokens can span across
tags. Instead, each leaf gets re-encoded from its text content via
the local tokenizer. This gives clean token boundaries aligned with
semantic structure — better for budgeting and potentially for the
model during fine-tuning.

Also skip serializing token_ids to conversation log (they're cached
state, recomputed on construction).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 17:08:42 -04:00
Kent Overstreet
2c401e24d6 Parser consumes stream directly, yields tool calls via channel
ResponseParser::run() spawns a task that reads StreamTokens, parses
into the AST (locking context per token), and sends PendingToolCalls
through a channel. Returns (tool_rx, JoinHandle<Result>) — the turn
loop dispatches tool calls and awaits the handle for error checking.

Token IDs from vLLM are accumulated alongside text and stored directly
on AST leaves — no local re-encoding on the response path.

The turn loop no longer matches on individual stream events. It just
reads tool calls and dispatches them.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 16:32:00 -04:00
Kent Overstreet
bf3e2a9b73 WIP: Rename context_new → context, delete old files, fix UI layer
Renamed context_new.rs to context.rs, deleted context_old.rs,
types.rs, openai.rs, parsing.rs. Updated all imports. Rewrote
user/context.rs and user/widgets.rs for new types. Stubbed
working_stack tool. Killed tokenize_conv_entry.

Remaining: mind/mod.rs, mind/dmn.rs, learn.rs, chat.rs,
subconscious.rs, oneshot.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:20:26 -04:00
Kent Overstreet
67e3228c32 Kill tiktoken — all token counting now uses Qwen 3.5 tokenizer
Remove tiktoken-rs dependency, CoreBPE field on Agent, and the
msg_token_count() function. All tokenization now goes through the
global HuggingFace tokenizer in agent/tokenizer.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 11:25:28 -04:00
Kent Overstreet
5e4067c04f Replace token counting with token generation via HuggingFace tokenizer
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.

ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.

Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.

The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 11:20:03 -04:00
Kent Overstreet
e0ee441aec Add ConversationEntry::Thinking — 0 tokens, not sent to API
Thinking/reasoning content is now a first-class entry type:
- Serialized as {"thinking": "..."} in conversation log
- 0 tokens for budgeting (doesn't count against context window)
- Filtered from assemble_api_messages (not sent back to model)
- Displayed in UI with "thinking: ..." label and expandable content

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 22:46:06 -04:00
Kent Overstreet
e213644514 Fix: only evict scored memories, not unscored
lowest_scored_memory() now skips memories with score=None. Unscored
memories haven't been evaluated — dropping them before scored
low-value ones loses potentially important context.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 21:06:45 -04:00
Kent Overstreet
a20f3e3642 Restore entry labels in context tree: role, tool calls, memory keys
ConversationEntry::label() provides descriptive labels matching the
old entry_sections format:
- "Kent: what about..." / "Aria: [tool_call: memory_search, ...]"
- "mem: [memory: key-name score:0.73]"
- "dmn: [heartbeat]" / "system: [system prompt]"

Uses config names (assistant_name, user_name) not generic "asst"/"user".
Widget renderer uses label() instead of raw content preview.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 21:04:41 -04:00
Kent Overstreet
b892cae2be Simplify trim_entries, kill ContextBudget
trim_entries is now a simple loop:
1. Drop duplicate memories and DMN entries
2. While over budget: if memories > 50% of entry tokens, drop
   lowest-scored memory; otherwise drop oldest conversation entry
3. Snap to user message boundary

ContextBudget is gone — sections already have cached token totals:
- total_tokens() on ContextState replaces budget.total()
- format_budget() on ContextState replaces budget.format()
- trim() takes fixed_tokens: usize (system + identity + journal)

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-07 20:58:06 -04:00
Kent Overstreet
62996e27d7 WIP: ContextEntry/ContextSection data structures for incremental token counting
New types — not yet wired to callers:

- ContextEntry: wraps ConversationEntry with cached token count and
  timestamp
- ContextSection: named group of entries with cached token total.
  Private entries/tokens, read via entries()/tokens().
  Mutation via push(entry), set(index, entry), del(index).
- ContextState: system/identity/journal/conversation sections + working_stack
- ConversationEntry::System variant for system prompt entries

Token counting happens once at push time. Sections maintain their
totals incrementally via push/set/del. No more recomputing from
scratch on every budget check.

Does not compile — callers need updating.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 20:48:08 -04:00
Kent Overstreet
776ac527f1 trim_entries: take ContextBudget instead of recomputing
compact() already computes context_budget() — pass it to trim_entries
so it has access to all budget components without recomputing them.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 19:43:39 -04:00
Kent Overstreet
df62b7ceaa Persist memory scores, use them for eviction in trim_entries
Scores are saved to memory-scores.json alongside the conversation log
after each scoring run, and loaded on startup — no more re-scoring
on restart.

trim_entries now evicts lowest-scored memories first (instead of
oldest-first) when memories exceed 50% of context. The 50% threshold
stays as a heuristic for memory-vs-conversation balance until we have
a scoring signal for conversation entries too. Unscored memories get
0.0, so they're evicted before scored ones.

save_memory_scores rebuilds from current entries, so evicted memories
are automatically expired from the scores file.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 19:39:08 -04:00
Kent Overstreet
cf1c64f936 Split context_state_summary: ContextBudget for compaction, UI-only for display
context_state_summary() was used for both compaction decisions (just
needs token counts) and debug screen display (needs full tree with
labels). Split into:

- Agent::context_budget() -> ContextBudget: cheap token counting by
  category, used by compact(), restore_from_log(), mind event loop
- ContextBudget::format(): replaces sections_budget_string() which
  fragily pattern-matched on section name strings
- context_state_summary(): now UI-only, formatting code stays here

Also extracted entry_sections() as shared helper with include_memories
param — false for context_state_summary (memories have own section),
true for conversation_sections_from() (subconscious screen shows all).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 19:02:58 -04:00
Kent Overstreet
f33b1767da Restrict API types visibility — types module is now private
Only Message, Role, MessageContent, ContentPart, ToolCall,
FunctionCall, Usage, ImageUrl are pub-exported from agent::api.

Internal types (ChatRequest, ChatCompletionChunk, ChunkChoice,
Delta, ReasoningConfig, ToolCallDelta, FunctionCallDelta) are
pub(crate) — invisible outside the crate.

All callers updated to import from agent::api:: instead of
agent::api::types::.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:39:20 -04:00
Kent Overstreet
39dcf27bd0 Memory scores on entries, not a separate Vec
ConversationEntry::Memory gains score: Option<f64>. The scorer
writes scores directly onto entries when results arrive. Removes
Agent.memory_scores Vec and the memory_scores parameter from
context_state_summary().

Scores are serialized to/from the conversation log as memory_score.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 03:14:24 -04:00
Kent Overstreet
77b68ecc50 Remove dead SharedContextState type and imports
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 03:05:21 -04:00
Kent Overstreet
b37b6d7495 Kill log callback — use ConversationEntry::Log for debug traces
Add Log variant to ConversationEntry that serializes to the
conversation log but is filtered out on read-back and API calls.
AutoAgent writes debug/status info (turns, tokens, tool calls)
through the conversation log instead of a callback parameter.

Removes the log callback from run_one_agent, call_api_with_tools,
and all callers.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 01:23:22 -04:00
Kent Overstreet
d5e6f55da9 Fix context budgeting and compaction
- Budget now counts exact message tokens matching what assemble_api_messages
  sends, not raw string content. Eliminates undercounting from formatting
  overhead (journal headers, personality separators, working stack).

- Load journal before trimming so trim accounts for journal cost.

- Compact before every turn, not just after turn completion. Prevents
  agent_cycle surfaced memories from pushing context over budget.

- Move agent_cycle orchestration from Agent::turn to Mind::start_turn —
  surfaced memories and reflections now precede the user message.

- Move AgentCycleState from Agent to Mind — it's orchestration, not
  per-agent state. memory_scoring_in_flight and memory_scores stay on
  Agent where they belong.

- Tag DMN entries as ConversationEntry::Dmn — compaction evicts them
  first since they're ephemeral. Compaction also prefers evicting
  memories over conversation when memories exceed 50% of entry tokens.

- Kill /retry slash command.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-06 22:43:55 -04:00
Kent Overstreet
c22b8c3a6f Unify budget and context state — single source of truth
Kill ContextBudget and recompute_budget entirely. Budget percentages,
used token counts, and compaction threshold checks now all derive from
the ContextSection tree built by context_state_summary(). This
eliminates the stale-budget bug where the cached budget diverged from
actual context contents.

Also: remove MindCommand::Turn — user input flows through
shared_mind.input exclusively. Mind::start_turn() atomically moves
text from pending input into the agent's context and spawns the turn.
Kill /retry. Make Agent::turn() take no input parameter.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-06 22:43:55 -04:00
Kent Overstreet
f4664ca06f Cache context budget instead of recomputing every frame
budget() called tiktoken on every UI tick, which was the main CPU hog
during rapid key input. Move the cached ContextBudget onto ContextState
and recompute only when entries actually change (push_entry, compact,
restore_from_log).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-06 22:43:55 -04:00
Kent Overstreet
222b2cbeb2 chat: PartialEq on ConversationEntry for proper diff
Add PartialEq to Message, FunctionCall, ToolCall, ContentPart,
ImageUrl, MessageContent, ConversationEntry. Sync now compares
entries directly instead of content lengths.

Phase 1 pops mismatched tail entries using PartialEq comparison.
Phase 2 pushes new entries with clone into last_entries buffer.

TODO: route_entry needs to handle multiple tool calls per entry.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-05 19:41:16 -04:00
Kent Overstreet
375a8d9738 move working_stack code to correct file 2026-04-04 18:19:21 -04:00
Kent Overstreet
9bebbcb635 Move API code from user/ to agent/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-04 00:34:48 -04:00
Kent Overstreet
2f0c7ce5c2 src/thought -> src/agent
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-03 22:24:56 -04:00
Kent Overstreet
214806cb90 move context functions from agent/context.rs to thought/context.rs
trim_conversation moved to thought/context.rs where model_context_window,
msg_token_count, is_context_overflow, is_stream_error already lived.
Delete the duplicate agent/context.rs (94 lines).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 15:28:00 -04:00
Kent Overstreet
078dcf22d0 cleanup: remove model name string matching
model_context_window() now reads from config.api_context_window
instead of guessing from model name strings. is_anthropic_model()
replaced with backend == "anthropic" checks. Dead model field
removed from AgentDef/AgentHeader.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 14:09:54 -04:00
Kent Overstreet
47c6694b10 Remove dead code: old context builder, plan_context, journal parsing
Removed from context.rs: ContextPlan, plan_context,
render_journal_text, assemble_context, truncate_at_section,
find_journal_cutoff, parse_msg_timestamp. All replaced by
trim_conversation + journal from memory graph.

Removed from tui.rs: most_recent_file, format_duration
(filesystem scanning leftovers).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 03:40:35 -04:00
Kent Overstreet
e9e47eb798 Replace build_context_window with trim_conversation
build_context_window loaded journal from a stale flat file and
assembled the full context. Now journal comes from the memory graph
and context is assembled on the fly. All that's needed is trimming
the conversation to fit the budget.

trim_conversation accounts for identity, journal, and reserve
tokens, then drops oldest conversation messages until it fits.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 03:35:28 -04:00
ProofOfConcept
998b71e52c flatten: move poc-memory contents to workspace root
No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-03-25 00:54:12 -04:00
Renamed from poc-memory/src/agent/context.rs (Browse further)