Instead of two separate notifications piling up on the status bar,
use a single ActivityGuard that updates in place during overflow
retries and auto-completes when the turn finishes.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Lets long-running operations update their status bar message without
creating/dropping a new activity per iteration. Useful for loops
like memory scoring where you want "scoring: 3/25 keyname" updating
in place.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The priority field existed in agent definitions and was serialized
into vLLM requests, but was never actually set — every request went
out with no priority, so vLLM treated them equally. This meant
background graph maintenance agents could preempt the main
conversation.
Add priority to AgentState and set it at each call site:
0 = interactive (main conversation)
1 = surface agent (needs to feed memories promptly)
2 = other subconscious agents
10 = unconscious/standalone agents (batch)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Mouse text selection with highlight rendering in panes
- OSC 52 clipboard copy on selection, middle-click paste via tmux buffer
- Bracketed paste support (Event::Paste)
- yield_to_user: no tool result appended, ends turn immediately
- yield_to_user: no parameters, just a control signal
- Drop arboard dependency, use crossterm OSC 52 + tmux for clipboard
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Uses JsonlBackwardIter (SIMD memrchr3) to scan the conversation log
newest-first without reading/parsing the whole file. Stops as soon
as the conversation budget is full. Only the kept nodes get
retokenized and pushed into context.
18MB log → only tokenize the ~50 nodes that fit in the budget.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
restore_from_log reads the full log but walks backwards from the tail,
retokenizing each node as it goes. Stops when conversation budget is
full. Only the nodes that fit get pushed into context.
Added AstNode::retokenize() — recomputes token_ids on all leaves
after deserialization (serde skip means they're empty).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New mcp_client.rs: McpRegistry manages MCP server connections.
Spawns child processes, speaks JSON-RPC 2.0 over stdio. Discovers
tools via tools/list, dispatches calls via tools/call.
dispatch_with_agent falls through to MCP after checking internal
tools. McpRegistry lives on Agent (shared across forks).
Still needs: config-driven server startup, system prompt integration.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The unconscious trigger holds the tokio mutex during heavy sync work
(store load, graph build, agent creation), blocking the UI tick which
needs the same lock for snapshots. Fix: try_lock in the UI — skip
the update if the trigger is running.
Also: restore_from_log was re-logging every restored node back to the
log file via push()'s auto-log. Added push_no_log() for restore path.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The log records what goes into context, so it belongs under the context
lock. push() now auto-logs conversation entries, eliminating all the
manual lock-state-for-log, drop, lock-context-for-push dances.
- ContextState: new conversation_log field, Clone impl drops it
(forked contexts don't log)
- push(): auto-logs Section::Conversation entries
- push_node, apply_tool_results, collect_results: all simplified
- collect_results: batch nodes under single context lock
- Assistant response logged under context lock after parse completes
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- push_node: notify before dropping state lock instead of relocking
- Mind::run: single lock for timeout + turn_active + has_input;
single lock for turn_handle + complete_turn
- Agent triggers (subconscious/unconscious) spawned as async tasks
so they don't block the select loop
- has_pending_input() peek for DMN sleep guard — don't sleep when
there's user input waiting
- unconscious: merge collect_results into trigger, single store load
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Subconscious agents (observe, etc.) fork the conscious agent's context
to share the KV cache prefix. When a multi-step agent fills the context
window, compacting blows the KV cache and evicts the step prompts,
leaving the model with no idea what it was doing.
Fix: forked agents set no_compact=true. On overflow, turn() returns the
error immediately (no compact+retry), and run_with_backend catches it
and returns Ok — the output tool has already written results to
Subconscious.state, so collect_results still picks them up.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Ported the old trim_entries logic to the new AstNode types:
- Phase 1: Dedup Memory nodes by key (keep last), drop DMN entries
- Phase 2: While over budget, evict lowest-scored memory (if memories
> 50% of conv tokens) or oldest conversation entry
- Phase 3: Snap to User message boundary at start
Called from compact() which runs on startup and on /compact.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
compact() cleared and rebuilt the system section but only pushed the
system prompt — tool definitions were lost. Since new() sets up the
system section correctly (prompt + tools), compact() now only reloads
identity and journal, leaving system untouched.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Parser skips Thinking nodes that are just whitespace. Conscious screen
now shows assistant children (Content, Thinking, ToolCall) as nested
tree items via recursive node_to_view. Nodes get timestamped in
push_node and on assistant branch creation.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The parser mutates the AST directly but doesn't write to the
conversation log. The turn loop now logs the completed assistant
branch after the parser handle resolves successfully.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
apply_tool_results() collects all results, then does one state lock
(remove from active_tools + write to log) and one context lock (push
all nodes). Eliminates redundant per-result locking.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New ActiveTools struct with proper methods: push, remove,
take_finished, take_foreground, iter, len. Turn loop uses
helpers instead of manual index iteration.
Removing SharedActiveTools (Arc<Mutex<Vec>>) — active tools
live directly in AgentState. A few UI callers still need
updating.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Made StreamToken pub (was pub(crate), needed by context.rs).
Removed dead API_CLIENT, get_client, sampling/priority fields
from oneshot. Suppressed pre-existing SkipIndex warning in learn.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
ResponseParser::run() spawns a task that reads StreamTokens, parses
into the AST (locking context per token), and sends PendingToolCalls
through a channel. Returns (tool_rx, JoinHandle<Result>) — the turn
loop dispatches tool calls and awaits the handle for error checking.
Token IDs from vLLM are accumulated alongside text and stored directly
on AST leaves — no local re-encoding on the response path.
The turn loop no longer matches on individual stream events. It just
reads tool calls and dispatches them.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Agent is now Arc<Agent> (immutable config). ContextState and AgentState
have separate tokio::sync::Mutex locks. The parser locks only context,
tool dispatch locks only state. No contention between the two.
All callers migrated: mind/, user/, tools/, oneshot, dmn, learn.
28 tests pass, zero errors.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split Agent into immutable Agent (behind Arc) and mutable AgentState
(behind its own Mutex). ContextState has its own Mutex on Agent.
Activities moved to AgentState. new() and fork() rewritten.
All callers need mechanical updates: agent.lock().await.field →
agent.state.lock().await.field or agent.context.lock().await.method.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Work in progress. New turn loop uses ResponseParser + StreamToken.
Killed StreamEvent, append_streaming, finalize_streaming, streaming_index,
assemble_api_messages, working_stack. Many methods still reference old
types — fixing next.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New context_new.rs with the AST-based context window design:
- AstNode: role + NodeBody (Leaf with text+token_ids, or Branch with children)
- Tokens only on leaves, branches walk children
- render() produces UTF-8, tokenize produces token IDs, same path
- ResponseParser state machine for streaming assistant responses
- Role enum covers all node types including sections
Still needs: fix remaining pattern match issues, add ContextState wrapper,
wire into mod.rs, replace old context.rs.
Does not compile yet — this is a design checkpoint.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Entries with empty token_ids (Thinking, Log) are not part of the
prompt and don't have messages. Skip them in streaming_index(),
route_entry(), and sync_from_agent() instead of calling .message()
which panics.
Using token_ids.is_empty() as the guard in streaming_index means
the check is tied to the data, not the type — any entry that
doesn't produce tokens is safely skipped.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New stream_completions() in openai.rs sends prompt as token IDs to
the completions endpoint instead of JSON messages to chat/completions.
Handles <think> tags in the response (split into Reasoning events)
and stops on <|im_end|> token.
start_stream_completions() on ApiClient provides the same interface
as start_stream() but takes token IDs instead of Messages.
The turn loop in Agent::turn() uses completions when the tokenizer
is initialized, falling back to the chat API otherwise. This allows
gradual migration — consciousness uses completions (Qwen tokenizer),
Claude Code hook still uses chat API (Anthropic).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Tool definitions are now pushed as a ContextEntry in the system
section at Agent construction time, formatted in the Qwen chat
template style. They're tokenized, scored, and treated like any
other context entry.
assemble_prompt_tokens() no longer takes a tools parameter —
tools are already in the context. This prepares for the switch
to /v1/completions where tools aren't a separate API field.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Remove tiktoken-rs dependency, CoreBPE field on Agent, and the
msg_token_count() function. All tokenization now goes through the
global HuggingFace tokenizer in agent/tokenizer.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.
ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.
Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.
The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
restore_from_log called .message() on all entries including Thinking
entries, which panic. Filter them out alongside Log entries.
Also fix bail-no-competing.sh: without nullglob, when no pid-* files
exist the glob stays literal and always triggers a false bail.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
StreamResult now includes accumulated reasoning text. After each
stream completes, if reasoning was produced, a Thinking entry is
pushed to the conversation before the response message.
Reasoning content is visible in the context tree UI but not sent
back to the API and doesn't count against the token budget.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thinking/reasoning content is now a first-class entry type:
- Serialized as {"thinking": "..."} in conversation log
- 0 tokens for budgeting (doesn't count against context window)
- Filtered from assemble_api_messages (not sent back to model)
- Displayed in UI with "thinking: ..." label and expandable content
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
trim_entries is now a simple loop:
1. Drop duplicate memories and DMN entries
2. While over budget: if memories > 50% of entry tokens, drop
lowest-scored memory; otherwise drop oldest conversation entry
3. Snap to user message boundary
ContextBudget is gone — sections already have cached token totals:
- total_tokens() on ContextState replaces budget.total()
- format_budget() on ContextState replaces budget.format()
- trim() takes fixed_tokens: usize (system + identity + journal)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
New types — not yet wired to callers:
- ContextEntry: wraps ConversationEntry with cached token count and
timestamp
- ContextSection: named group of entries with cached token total.
Private entries/tokens, read via entries()/tokens().
Mutation via push(entry), set(index, entry), del(index).
- ContextState: system/identity/journal/conversation sections + working_stack
- ConversationEntry::System variant for system prompt entries
Token counting happens once at push time. Sections maintain their
totals incrementally via push/set/del. No more recomputing from
scratch on every budget check.
Does not compile — callers need updating.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
compact() already computes context_budget() — pass it to trim_entries
so it has access to all budget components without recomputing them.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
context_state_summary() was used for both compaction decisions (just
needs token counts) and debug screen display (needs full tree with
labels). Split into:
- Agent::context_budget() -> ContextBudget: cheap token counting by
category, used by compact(), restore_from_log(), mind event loop
- ContextBudget::format(): replaces sections_budget_string() which
fragily pattern-matched on section name strings
- context_state_summary(): now UI-only, formatting code stays here
Also extracted entry_sections() as shared helper with include_memories
param — false for context_state_summary (memories have own section),
true for conversation_sections_from() (subconscious screen shows all).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Add provenance field to Agent, set to "agent:{name}" for forked
subconscious agents. Memory tools (write, link_add, supersede,
journal_new, journal_update) now read provenance from the Agent
context when available, falling back to "manual" for interactive use.
AutoAgent passes the forked agent to dispatch_with_agent so tools
can access it.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Only Message, Role, MessageContent, ContentPart, ToolCall,
FunctionCall, Usage, ImageUrl are pub-exported from agent::api.
Internal types (ChatRequest, ChatCompletionChunk, ChunkChoice,
Delta, ReasoningConfig, ToolCallDelta, FunctionCallDelta) are
pub(crate) — invisible outside the crate.
All callers updated to import from agent::api:: instead of
agent::api::types::.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
ConversationEntry::Memory gains score: Option<f64>. The scorer
writes scores directly onto entries when results arrive. Removes
Agent.memory_scores Vec and the memory_scores parameter from
context_state_summary().
Scores are serialized to/from the conversation log as memory_score.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
F1 and F2 screens now call agent.context_state_summary() directly
via try_lock/lock instead of reading from a shared RwLock cache.
Removes SharedContextState, publish_context_state(), and
publish_context_state_with_scores().
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Add Log variant to ConversationEntry that serializes to the
conversation log but is filtered out on read-back and API calls.
AutoAgent writes debug/status info (turns, tokens, tool calls)
through the conversation log instead of a callback parameter.
Removes the log callback from run_one_agent, call_api_with_tools,
and all callers.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Agent::fork() clones context for KV cache sharing with conscious agent.
AutoAgent runs multi-step prompt sequences with tool dispatch — used by
both oneshot CLI agents and (soon) Mind's subconscious agents.
call_api_with_tools() now delegates to AutoAgent internally; existing
callers unchanged.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>