ContextState now owns everything in the context window:
system_prompt, personality, journal, working_stack, loaded_nodes,
and conversation messages. No duplication — each piece exists once
in its typed form.
assemble_api_messages() renders the full message list on the fly
from typed sources. measure_budget() counts each bucket from its
source directly. push_context() removed — identity/journal are
never pushed as messages.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Identity tokens from system_prompt + personality vec. Journal
from journal entries vec. Memory from loaded_nodes. Conversation
is the remainder. No string prefix matching.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Count journal tokens directly from Vec<JournalEntry> instead of
scanning message text for prefix strings. Type system, not string
typing.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Keep journal entries as structured data in ContextState. Render
to text only when building the context message. Debug screen reads
the structured entries directly — no parsing ## headers back out.
Compaction paths temporarily parse the string from build_context_window
back to entries (to be cleaned up when compaction is reworked).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Render journal entries directly with ## headers instead of going
through the plan_context/render_journal_text pipeline. 5% of
model context window (~6500 tokens for Qwen 128K). Simpler and
predictable.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Iterate journal entries backwards from the conversation cutoff,
accumulating within ~10K token budget (~8% of context window).
Stops when budget is full, keeps at least one entry. Much more
efficient than loading all entries and trimming.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Replace flat-file journal parser with direct store query for
EpisodicSession nodes. Filter journal entries to only those older
than the oldest conversation message (plus one overlap entry to
avoid gaps). Falls back to 20 recent entries when no conversation
exists yet.
Fixes: poc-agent context window showing 0 journal entries.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
spawn_agent returns Child handle + log_path. AgentCycleState stores
the Child, polls with try_wait() on each trigger to detect completion.
No more filesystem scanning to track agent lifecycle.
AgentSnapshot (Clone) sent to TUI for display. AgentInfo holds the
Child handle and stays in the state.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Runner owns AgentCycleState, calls trigger() on each user message
instead of the old run_hook() JSON round-trip. Sends AgentUpdate
messages to TUI after each cycle.
TUI F2 screen reads agent state from messages instead of scanning
the filesystem on every frame. HookSession::from_fields() lets
poc-agent construct sessions without JSON serialization.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Remove POC_AGENT early return (was from old claude -p era)
- Split hook into run_agent_cycles() -> AgentCycleOutput (returns
memory keys + reflection) and format_agent_output() (renders for
Claude Code injection). poc-agent can call run_agent_cycles
directly and handle output its own way.
- Fix UTF-8 panic in runner.rs display_buf slicing (floor_char_boundary)
- Add priority debug label to API requests
- Wire up F2 agents screen: live pid status, output files, hook log
tail, arrow key navigation, Enter for log detail view
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thread request priority through the API call chain to vLLM's
priority scheduler. Lower value = higher priority, with preemption.
Priority is set per-agent in the .agent header:
- interactive (runner): 0 (default, highest)
- surface-observe: 1 (near-realtime, watches conversation)
- all other agents: 10 (batch, default if not specified)
Requires vLLM started with --scheduling-policy priority.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split the streaming pipeline: API backends yield StreamEvents through
a channel, the runner reads them and routes to the appropriate UI pane.
- Add StreamEvent enum (Content, Reasoning, ToolCallDelta, etc.)
- API start_stream() spawns backend as a task, returns event receiver
- Runner loops over events, sends content to conversation pane but
suppresses <tool_call> XML with a buffered tail for partial tags
- OpenAI backend refactored to stream_events() — no more UI coupling
- Anthropic backend gets a wrapper that synthesizes events from the
existing stream() (TODO: native event streaming)
- chat_completion_stream() kept for subconscious agents, reimplemented
on top of the event stream
- Usage derives Clone
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Tool call parsing was only in runner.rs, so subconscious agents
(poc-memory agent run) never recovered leaked tool calls from
models that emit <tool_call> as content text (e.g. Qwen via Crane).
Move the recovery into build_response_message where both code paths
share it. Leaked tool calls are promoted to structured tool_calls
and the content is cleaned, so all consumers see them uniformly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Personality is identity, not memory. Memory is nodes loaded during
the session via tool calls — things I've actively looked at.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
mem% was always 0 because memory_tokens was hardcoded to 0. Now
counts personality context + loaded nodes from memory tool calls.
Also calls measure_budget + publish_context_state after memory tool
dispatch so the debug screen updates immediately.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
MemoryNode moved from agent/memory.rs to hippocampus/memory.rs — it's
a view over hippocampus data, not agent-specific.
Store operations (set_weight, set_link_strength, add_link) moved into
store/ops.rs. CLI code (cli/graph.rs, cli/node.rs) and agent tools
both call the same store methods now. render_node() delegates to
MemoryNode::from_store().render() — 3 lines instead of 40.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Memory tools now dispatch through a special path in the runner (like
working_stack) instead of the generic tools::dispatch. This gives them
&mut self access to track loaded nodes:
- memory_render/memory_links: loads MemoryNode, registers in
context.loaded_nodes (replace if already tracked)
- memory_write: refreshes existing tracked node if present
- All other memory tools: dispatch directly, no tracking needed
The debug screen (context_state_summary) now shows a "Memory nodes"
section listing all loaded nodes with version, weight, and link count.
This is the agent knowing what it's holding — the foundation for
intelligent refresh and eviction.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The agent was shelling out to poc-hook which shells out to memory-search.
Now that everything is one crate, just call the library function. Removes
subprocess overhead on every user message.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-03-25 00:54:12 -04:00
Renamed from poc-memory/src/agent/runner.rs (Browse further)