Qwen's chat template renders tool results as:
<|im_start|>user\n<tool_response>\n{content}\n</tool_response><|im_end|>
We were rendering as:
<|im_start|>tool\n{content}<|im_end|>
The model never saw <|im_start|>tool in training, so it ignored our
tool results and looped, retrying the same call. We found this by
comparing our tokenization against the output of vLLM's /tokenize
endpoint for the same chat messages.
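A minimal sketch of the corrected rendering, assuming a free function named render_tool_result (the name is hypothetical; only the template string comes from the description above):

```rust
/// Render a tool result as a user turn wrapping the content in
/// <tool_response> tags, matching the template quoted above.
/// The function name is an assumption made for illustration.
pub fn render_tool_result(content: &str) -> String {
    format!("<|im_start|>user\n<tool_response>\n{content}\n</tool_response><|im_end|>")
}
```

The role is "user" rather than "tool", so the rendered prefix is one the model has seen in training.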
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
compact() was clearing tool definitions from the system section on
startup — now leaves system section untouched (set once by new()).
Added context token count to parser done log for diagnosing the
subconscious agent loop issue.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
compact() cleared and rebuilt the system section but only pushed the
system prompt — tool definitions were lost. Since new() sets up the
system section correctly (prompt + tools), compact() now only reloads
identity and journal, leaving system untouched.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Logs the full response text when no tool calls are detected, and the
tool call bodies when they are found. Uses per-agent log files for
debugging subconscious agent parsing issues.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Logs full text length, <tool_call> tag count, and tool call details
on stream completion. Helps diagnose parsing issues with subconscious
agents.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The code was checking the trimmed text but storing the untrimmed
version. It now stores the trimmed version, so there is no leading or
trailing whitespace in the AST.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Content between tags (e.g. newlines between </think> and <tool_call>)
was creating empty Content nodes. Now trimmed before creating the node.
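A small sketch of the trim-before-create idea, assuming a simplified Node type and a hypothetical content_node helper (neither is taken from the real parser):

```rust
/// Simplified stand-in for an AST content node (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Node {
    pub text: String,
}

/// Build a Content node from the raw text found between tags.
/// Returns None when the text is only whitespace, so that newlines
/// between tags never become empty nodes. The stored text is the
/// trimmed version, not the raw one.
pub fn content_node(raw: &str) -> Option<Node> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(Node { text: trimmed.to_string() })
    }
}
```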
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Parser skips Thinking nodes that are just whitespace. Conscious screen
now shows assistant children (Content, Thinking, ToolCall) as nested
tree items via recursive node_to_view. Nodes get timestamped in
push_node and on assistant branch creation.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The parser can't reliably split model-produced token IDs at tag
boundaries (<think>, <tool_call>) because BPE tokens can span across
tags. Instead, each leaf gets re-encoded from its text content via
the local tokenizer. This gives clean token boundaries aligned with
semantic structure — better for budgeting and potentially for the
model during fine-tuning.
Also skip serializing token_ids to conversation log (they're cached
state, recomputed on construction).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The parser mutates the AST directly but doesn't write to the
conversation log. The turn loop now logs the completed assistant
branch after the parser handle resolves successfully.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
sync_from_agent now detects changed entries by comparing token counts
(cheap proxy for content changes during streaming). Changed entries
get popped and re-pushed. Extracted push_routed/pop_routed helpers.
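One way the token-count comparison could look, as a simplified sketch: it pops everything from the first mismatching entry onward and re-pushes from the agent's list. The Entry type and the signature are assumptions, and the real code may pop entries individually.

```rust
/// Hypothetical view entry: text plus its cached token count.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub text: String,
    pub token_count: usize,
}

/// Bring `view` in line with `agent`, using token counts as a cheap
/// proxy for "content changed". Returns how many entries were pushed.
pub fn sync_from_agent(view: &mut Vec<Entry>, agent: &[Entry]) -> usize {
    // First index where the two lists disagree on token count; if none
    // disagree, start at the end of the shorter list.
    let first_changed = view
        .iter()
        .zip(agent.iter())
        .position(|(v, a)| v.token_count != a.token_count)
        .unwrap_or(view.len().min(agent.len()));

    // Pop the changed tail, then re-push it from the agent's entries.
    view.truncate(first_changed);
    view.extend_from_slice(&agent[first_changed..]);
    agent.len() - first_changed
}
```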
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
apply_tool_results() collects all results, then does one state lock
(remove from active_tools + write to log) and one context lock (push
all nodes). Eliminates redundant per-result locking.
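A sketch of the batching pattern under stated assumptions: std::sync::Mutex stands in for the async mutex, and State, Context, and ToolResult are hypothetical simplified types.

```rust
use std::sync::Mutex;

/// Hypothetical tool result: the call it answers and its output.
pub struct ToolResult {
    pub call_id: u64,
    pub output: String,
}

#[derive(Default)]
pub struct State {
    pub active_tools: Vec<u64>,
    pub log: Vec<String>,
}

#[derive(Default)]
pub struct Context {
    pub nodes: Vec<String>,
}

/// Apply all results with one state lock and one context lock,
/// instead of locking once per result.
pub fn apply_tool_results(state: &Mutex<State>, context: &Mutex<Context>, results: Vec<ToolResult>) {
    {
        // Single state lock: drop finished tools and write to the log.
        let mut st = state.lock().unwrap();
        for r in &results {
            st.active_tools.retain(|id| *id != r.call_id);
            st.log.push(r.output.clone());
        }
    } // state lock released here, before the context lock is taken

    // Single context lock: push every result node.
    let mut ctx = context.lock().unwrap();
    ctx.nodes.extend(results.into_iter().map(|r| r.output));
}
```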
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New ActiveTools struct with proper methods: push, remove, abort_all,
take_finished, take_foreground, iter, len. Lives directly on AgentState,
no separate Arc<Mutex> needed.
TUI reads active tools through agent.state.try_lock(). Turn loop uses
helpers instead of manual index iteration.
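A rough sketch of what such a struct might look like. Only the method names come from the list above; the ActiveTool fields and the take_where helper are assumptions, and abort_all here only drains the list where the real one presumably also cancels running work.

```rust
/// Hypothetical record for one in-flight tool call.
#[derive(Debug, PartialEq)]
pub struct ActiveTool {
    pub id: u64,
    pub finished: bool,
    pub foreground: bool,
}

#[derive(Default)]
pub struct ActiveTools {
    tools: Vec<ActiveTool>,
}

impl ActiveTools {
    pub fn push(&mut self, tool: ActiveTool) {
        self.tools.push(tool);
    }

    /// Remove one tool by id, if present.
    pub fn remove(&mut self, id: u64) -> Option<ActiveTool> {
        let idx = self.tools.iter().position(|t| t.id == id)?;
        Some(self.tools.remove(idx))
    }

    /// Drain every tool (sketch only: no task cancellation here).
    pub fn abort_all(&mut self) -> Vec<ActiveTool> {
        std::mem::take(&mut self.tools)
    }

    pub fn take_finished(&mut self) -> Vec<ActiveTool> {
        self.take_where(|t| t.finished)
    }

    pub fn take_foreground(&mut self) -> Vec<ActiveTool> {
        self.take_where(|t| t.foreground)
    }

    pub fn iter(&self) -> impl Iterator<Item = &ActiveTool> {
        self.tools.iter()
    }

    pub fn len(&self) -> usize {
        self.tools.len()
    }

    /// Split the list by a predicate, keeping the non-matching tools.
    fn take_where(&mut self, pred: impl Fn(&ActiveTool) -> bool) -> Vec<ActiveTool> {
        let (taken, kept): (Vec<_>, Vec<_>) =
            std::mem::take(&mut self.tools).into_iter().partition(|t| pred(t));
        self.tools = kept;
        taken
    }
}
```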
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New ActiveTools struct with proper methods: push, remove,
take_finished, take_foreground, iter, len. Turn loop uses
helpers instead of manual index iteration.
Removing SharedActiveTools (Arc<Mutex<Vec>>) — active tools
live directly in AgentState. A few UI callers still need
updating.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Made StreamToken pub (was pub(crate), needed by context.rs).
Removed dead API_CLIENT, get_client, sampling/priority fields
from oneshot. Suppressed pre-existing SkipIndex warning in learn.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
ResponseParser::run() spawns a task that reads StreamTokens, parses
into the AST (locking context per token), and sends PendingToolCalls
through a channel. Returns (tool_rx, JoinHandle<Result>) — the turn
loop dispatches tool calls and awaits the handle for error checking.
Token IDs from vLLM are accumulated alongside text and stored directly
on AST leaves — no local re-encoding on the response path.
The turn loop no longer matches on individual stream events. It just
reads tool calls and dispatches them.
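The shape of this can be sketched with std threads and channels standing in for the async task and stream (an assumption for the sake of a self-contained example). run_parser and the String token type are hypothetical, and the AST mutation and token ID accumulation are omitted.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// Handle for a tool call that is ready to dispatch.
#[derive(Debug, PartialEq)]
pub struct PendingToolCall {
    pub body: String,
}

const OPEN: &str = "<tool_call>";
const CLOSE: &str = "</tool_call>";

/// Spawn a worker that reads streamed text pieces, finds complete
/// <tool_call> blocks, and sends them through a channel. The caller
/// reads tool calls from the receiver and joins the handle afterwards
/// to check for errors.
pub fn run_parser(
    tokens: Receiver<String>,
) -> (Receiver<PendingToolCall>, JoinHandle<Result<(), String>>) {
    let (tool_tx, tool_rx) = mpsc::channel();
    let handle = thread::spawn(move || -> Result<(), String> {
        let mut buf = String::new();
        for tok in tokens {
            buf.push_str(&tok);
            // Emit every complete tool call currently in the buffer.
            while let Some(end) = buf.find(CLOSE) {
                let start = buf
                    .find(OPEN)
                    .filter(|s| *s < end)
                    .ok_or_else(|| "closing tag without opening tag".to_string())?;
                let body = buf[start + OPEN.len()..end].trim().to_string();
                tool_tx
                    .send(PendingToolCall { body })
                    .map_err(|_| "tool call receiver dropped".to_string())?;
                buf.drain(..end + CLOSE.len());
            }
        }
        if buf.contains(OPEN) {
            return Err("stream ended inside a tool call".to_string());
        }
        Ok(())
    });
    (tool_rx, handle)
}
```

Because the buffer is only scanned for complete closing tags, a tag split across several streamed pieces is handled without special casing.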
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Agent is now Arc<Agent> (immutable config). ContextState and AgentState
have separate tokio::sync::Mutex locks. The parser locks only context,
tool dispatch locks only state. No contention between the two.
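A sketch of the split, with std::sync::Mutex standing in for tokio::sync::Mutex so the example stays dependency-free. The field contents and the parser_step/dispatch_step helpers are hypothetical.

```rust
use std::sync::{Arc, Mutex};

#[derive(Default)]
pub struct ContextState {
    pub nodes: Vec<String>,
}

#[derive(Default)]
pub struct AgentState {
    pub active_tools: Vec<String>,
}

/// Immutable config plus two independently locked pieces of state.
pub struct Agent {
    pub model: String,
    pub context: Mutex<ContextState>,
    pub state: Mutex<AgentState>,
}

impl Agent {
    pub fn new(model: &str) -> Arc<Agent> {
        Arc::new(Agent {
            model: model.to_string(),
            context: Mutex::new(ContextState::default()),
            state: Mutex::new(AgentState::default()),
        })
    }
}

/// The parser side only ever takes the context lock.
pub fn parser_step(agent: &Agent, text: &str) {
    agent.context.lock().unwrap().nodes.push(text.to_string());
}

/// Tool dispatch only ever takes the state lock.
pub fn dispatch_step(agent: &Agent, tool: &str) {
    agent.state.lock().unwrap().active_tools.push(tool.to_string());
}
```

Since the two paths never take each other's lock, holding the context lock does not block tool dispatch.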
All callers migrated: mind/, user/, tools/, oneshot, dmn, learn.
28 tests pass, zero errors.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Bulk replaced Arc<Mutex<Agent>> with Arc<Agent> across all files.
Fixed control.rs, memory.rs tool handlers. Fixed oneshot Backend.
Remaining errors are all agent.lock() → agent.state.lock() or
agent.context.lock() in mind/, user/, and a few in mod.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split Agent into immutable Agent (behind Arc) and mutable AgentState
(behind its own Mutex). ContextState has its own Mutex on Agent.
Activities moved to AgentState. new() and fork() rewritten.
All callers need mechanical updates: agent.lock().await.field →
agent.state.lock().await.field or agent.context.lock().await.method.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
API is now two files: mod.rs (430 lines) and http.rs. Contains:
Usage, StreamToken, SamplingParams, ApiClient, stream_completions,
SseReader, send_and_check. Everything else is dead and gone.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Removed all chat completions wire types that are no longer used:
ChatRequest, ReasoningConfig, ChatCompletionChunk, ChunkChoice,
Delta, FunctionCallDelta, ToolCallDelta, append_content, user_with_images.
Remaining types in api/types.rs are transitional (Message, ToolCall, etc.)
— they'll go away as outer callers migrate to AstNode.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Deleted: api/parsing.rs entirely (parsing now in context_new.rs),
stream_events (chat completions path), collect_stream, build_response_message,
log_diagnostics, tools_to_json_str, start_stream, chat_completion_stream_temp.
API layer is now just: stream_completion (token IDs in/out), SseReader,
send_and_check, and types. Zero errors in api/.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Work in progress. New turn loop uses ResponseParser + StreamToken.
Killed StreamEvent, append_streaming, finalize_streaming, streaming_index,
assemble_api_messages, working_stack. Many methods still reference old
types — fixing next.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The parser takes &mut ContextState on feed()/finish() and pushes
completed children (content, thinking, tool calls) directly into
the assistant branch. Only PendingToolCall handles are returned
to the caller for dispatch — the caller no longer manages AST
mutation.
Tests verify by reading back from ContextState after parsing.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
feed() now returns all completed children (not just tool calls) so the
caller can push them into the AST as they arrive. finish() returns
remaining buffered children. The caller manages the assistant branch.
Added ContextState::push_child() for appending to an existing branch,
PendingToolCall for ephemeral dispatch handles, and len() for section
size queries.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Prep for wiring context_new.rs into the codebase: AstNode, NodeLeaf,
NodeBody, Role all derive Serialize/Deserialize for conversation log
persistence.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
render_into(&mut String) and token_ids_into(&mut Vec<u32>) recurse
the tree extending the output in place. Branches emit their wrapping
(im_start/role/im_end) and recurse into children — same structure in
both methods. token_ids() now composes from cached leaf tokens instead
of re-encoding the full rendered string.
Killed the AstEvent/AstIter iterator experiment — explicit recursion
is cleaner for a tree walk that isn't truly flattening.
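The parallel recursion could look roughly like this. The node layout is simplified, and the special-token IDs are placeholder values chosen for illustration, not real Qwen vocabulary IDs.

```rust
#[derive(Clone, Copy)]
pub enum Role {
    System,
    User,
    Assistant,
}

impl Role {
    fn as_str(self) -> &'static str {
        match self {
            Role::System => "system",
            Role::User => "user",
            Role::Assistant => "assistant",
        }
    }
}

pub enum AstNode {
    Leaf { text: String, token_ids: Vec<u32> },
    Branch { role: Role, children: Vec<AstNode> },
}

// Placeholder IDs (assumptions): a real tokenizer supplies these.
const IM_START: u32 = 1;
const IM_END: u32 = 2;
const NEWLINE: u32 = 3;

fn role_token(role: Role) -> u32 {
    match role {
        Role::System => 10,
        Role::User => 11,
        Role::Assistant => 12,
    }
}

impl AstNode {
    /// Append this subtree's text to `out`.
    pub fn render_into(&self, out: &mut String) {
        match self {
            AstNode::Leaf { text, .. } => out.push_str(text),
            AstNode::Branch { role, children } => {
                out.push_str("<|im_start|>");
                out.push_str(role.as_str());
                out.push('\n');
                for child in children {
                    child.render_into(out);
                }
                out.push_str("<|im_end|>\n");
            }
        }
    }

    /// Append this subtree's token IDs to `out`, reusing the token IDs
    /// cached on each leaf instead of re-encoding the rendered string.
    pub fn token_ids_into(&self, out: &mut Vec<u32>) {
        match self {
            AstNode::Leaf { token_ids, .. } => out.extend_from_slice(token_ids),
            AstNode::Branch { role, children } => {
                out.extend_from_slice(&[IM_START, role_token(*role), NEWLINE]);
                for child in children {
                    child.token_ids_into(out);
                }
                out.extend_from_slice(&[IM_END, NEWLINE]);
            }
        }
    }
}
```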
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The Ast trait is implemented by both AstNode and ContextState, so
anything that needs "give me the prompt" can take impl Ast.
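A minimal sketch of a shared trait of this kind. The single render method and the simplified types are assumptions, since the trait's actual methods are not listed here.

```rust
/// Anything that can produce prompt text (method set is hypothetical).
pub trait Ast {
    fn render(&self) -> String;
}

/// Simplified stand-ins for the real types.
pub struct AstNode {
    pub text: String,
}

pub struct ContextState {
    pub nodes: Vec<AstNode>,
}

impl Ast for AstNode {
    fn render(&self) -> String {
        self.text.clone()
    }
}

impl Ast for ContextState {
    fn render(&self) -> String {
        // Concatenate the rendering of every node in order.
        self.nodes.iter().map(|n| n.render()).collect()
    }
}

/// A caller that only needs the prompt can accept either type.
pub fn prompt_len(ast: &impl Ast) -> usize {
    ast.render().len()
}
```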
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Role is now just System/User/Assistant — maps 1:1 to the grammar.
Leaf types are NodeBody variants: Content, Thinking, ToolCall,
ToolResult, Memory, Dmn, Log. Each variant renders itself; no Role
needed on leaves. AstNode is Leaf(NodeLeaf) | Branch{role, children}.
ContextState holds four Vec<AstNode> sections directly.
Moved tool call XML parsing from api/parsing.rs into context_new.rs
so all grammar knowledge lives in one place.
Tokenizer encode() now returns empty vec when uninitialized instead
of panicking, so tests work without the tokenizer file.
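A sketch of the "empty when uninitialized" behavior using std::sync::OnceLock. The Tokenizer here is a byte-level stand-in rather than the real tokenizer, and init() is a hypothetical name.

```rust
use std::sync::OnceLock;

/// Stand-in tokenizer: one token per byte (not a real BPE tokenizer).
pub struct Tokenizer;

impl Tokenizer {
    fn encode(&self, text: &str) -> Vec<u32> {
        text.bytes().map(u32::from).collect()
    }
}

static TOKENIZER: OnceLock<Tokenizer> = OnceLock::new();

/// Install the global tokenizer; later calls are ignored.
pub fn init() {
    let _ = TOKENIZER.set(Tokenizer);
}

/// Encode text, returning an empty vec instead of panicking when the
/// global tokenizer has not been initialized.
pub fn encode(text: &str) -> Vec<u32> {
    match TOKENIZER.get() {
        Some(tokenizer) => tokenizer.encode(text),
        None => Vec::new(),
    }
}
```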
26 tests: XML parsing, incremental streaming (char-by-char feeds
found and fixed a lookahead bug), rendering for all node types,
tokenizer round-trip verification.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
AstNode fields are now private with read-only accessors. All mutation
goes through ContextState methods (push, set_message, set_score, del)
which guarantee token_ids stays in sync with text on every leaf.
Also fix ResponseParser to use AstNode::tool_call() constructor,
widen parsing module visibility to pub(crate).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New context_new.rs with the AST-based context window design:
- AstNode: role + NodeBody (Leaf with text+token_ids, or Branch with children)
- Tokens only on leaves, branches walk children
- render() produces UTF-8, tokenize produces token IDs, same path
- ResponseParser state machine for streaming assistant responses
- Role enum covers all node types including sections
Still needs: fix remaining pattern match issues, add ContextState wrapper,
wire into mod.rs, replace old context.rs.
Does not compile yet — this is a design checkpoint.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Entries with empty token_ids (Thinking, Log) are not part of the
prompt and don't have messages. Skip them in streaming_index(),
route_entry(), and sync_from_agent() instead of calling .message()
which panics.
Using token_ids.is_empty() as the guard in streaming_index means
the check is tied to the data, not the type — any entry that
doesn't produce tokens is safely skipped.
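A small sketch of a data-driven guard of this kind. The Entry type and the prompt_entries helper are hypothetical.

```rust
/// Hypothetical context entry: a label plus its cached token IDs.
pub struct Entry {
    pub label: &'static str,
    pub token_ids: Vec<u32>,
}

/// Iterate only over entries that contribute tokens to the prompt.
/// The check looks at the data rather than the entry type, so any
/// entry with no tokens is skipped.
pub fn prompt_entries(entries: &[Entry]) -> impl Iterator<Item = &Entry> {
    entries.iter().filter(|e| !e.token_ids.is_empty())
}
```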
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The consciousness binary has its own main() separate from poc-memory.
Agent::new() creates ContextEntries which need the tokenizer, so it
must be initialized before Mind::new().
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New stream_completions() in openai.rs sends prompt as token IDs to
the completions endpoint instead of JSON messages to chat/completions.
Handles <think> tags in the response (split into Reasoning events)
and stops on <|im_end|> token.
start_stream_completions() on ApiClient provides the same interface
as start_stream() but takes token IDs instead of Messages.
The turn loop in Agent::turn() uses completions when the tokenizer
is initialized, falling back to the chat API otherwise. This allows
gradual migration — consciousness uses completions (Qwen tokenizer),
Claude Code hook still uses chat API (Anthropic).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Tool definitions are now pushed as a ContextEntry in the system
section at Agent construction time, formatted in the Qwen chat
template style. They're tokenized, scored, and treated like any
other context entry.
assemble_prompt_tokens() no longer takes a tools parameter —
tools are already in the context. This prepares for the switch
to /v1/completions where tools aren't a separate API field.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Remove tiktoken-rs dependency, CoreBPE field on Agent, and the
msg_token_count() function. All tokenization now goes through the
global HuggingFace tokenizer in agent/tokenizer.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.
ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.
Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.
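Because each entry carries its own template-wrapped token IDs, prompt assembly reduces to concatenation. A sketch, with a simplified ContextEntry and assumed signatures:

```rust
/// Simplified entry: only the cached token IDs are modeled here.
pub struct ContextEntry {
    pub token_ids: Vec<u32>,
}

impl ContextEntry {
    /// The token count is derived from the stored IDs.
    pub fn tokens(&self) -> usize {
        self.token_ids.len()
    }
}

/// Concatenate per-entry token IDs into one prompt, in entry order.
pub fn assemble_prompt_tokens(entries: &[ContextEntry]) -> Vec<u32> {
    entries
        .iter()
        .flat_map(|e| e.token_ids.iter().copied())
        .collect()
}
```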
The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
restore_from_log called .message() on all entries, including Thinking
entries, where it panics. Filter them out alongside Log entries.
Also fix bail-no-competing.sh: without nullglob, when no pid-* files
exist the glob stays literal and always triggers a false bail.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The bail-no-competing.sh script expects $1 to be the path to the
current agent's pid file so it can skip it when checking for
competing processes. But the runner wasn't passing any arguments,
so $1 was empty and the script treated every pid file (including
the agent's own) as a competing process — bailing every time.
This caused surface-observe to always bail at step 2, preventing
all memory graph maintenance (organize, observe phases) from
running.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The reaper checks if agent PIDs are alive via kill(pid, 0), but if
the PID was reused by an unrelated process, the check succeeds and
the stale pid file blocks the agent from re-launching indefinitely.
Fix: read /proc/<pid>/cmdline and verify the process is actually a
claude/poc-memory process. If not, remove the pid file.
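A sketch of the cmdline check on Linux. The function names and the substring matching are assumptions; the real reaper may compare more strictly.

```rust
use std::fs;

/// /proc/<pid>/cmdline holds the arguments separated by NUL bytes.
/// Return true if any argument contains one of the expected names.
pub fn cmdline_matches(cmdline: &[u8], expected: &[&str]) -> bool {
    cmdline
        .split(|b| *b == 0)
        .filter_map(|arg| std::str::from_utf8(arg).ok())
        .any(|arg| expected.iter().any(|name| arg.contains(*name)))
}

/// True only if the pid exists and looks like one of our processes.
/// A reused pid belonging to an unrelated process returns false, so
/// the caller can treat the pid file as stale and remove it.
pub fn pid_is_ours(pid: u32) -> bool {
    match fs::read(format!("/proc/{pid}/cmdline")) {
        Ok(bytes) => cmdline_matches(&bytes, &["claude", "poc-memory"]),
        Err(_) => false,
    }
}
```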
This caused memory surfacing to stop working for the entire April 7
session — a dead surface-observe process's PID was reused, blocking
all subsequent surfacing attempts with "already running".
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
StreamResult now includes accumulated reasoning text. After each
stream completes, if reasoning was produced, a Thinking entry is
pushed to the conversation before the response message.
Reasoning content is visible in the context tree UI but not sent
back to the API and doesn't count against the token budget.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>