age_out_images now keeps 1 existing image plus the 1 about to be
added, so 2 images stay live for motion/comparison. Previously it
aged everything out down to 1.
Reduces image bloat in conversation log and context.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Large tool results (memory renders, bash output) consume most of
the 2MB budget — only 37 entries loaded from a 527-line log.
8MB captures ~300 entries, giving compact() enough conversation
to work with.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Context was too aggressively trimmed — 80% free after compaction.
Budget was 60% of window minus 25% reserve = only 45% usable.
Now: 80% of window for total budget (20% output reserve built in),
no extra reserve subtraction. Journal budget 5% → 15% to carry
more context across compactions.
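
A minimal sketch of the arithmetic above. It assumes the old 25% reserve
was taken off the 60% budget (0.60 * 0.75 = 45%) and that the journal
share is a fraction of the whole window. The type and function names are
hypothetical, not the ones in the tree.

```rust
/// Token budgets derived from the model context window.
/// Percentages come from the commit text; names are hypothetical.
#[derive(Debug, PartialEq)]
pub struct Budgets {
    pub total: usize,
    pub journal: usize,
}

/// Old scheme (assumed reading): 60% of the window, then a 25% reserve
/// removed from that budget, which leaves 45% of the window.
pub fn old_usable(window: usize) -> usize {
    let budget = window * 60 / 100;
    budget - budget * 25 / 100
}

/// New scheme: 80% of the window with no extra reserve subtraction,
/// and 15% of the window for the journal.
pub fn new_budgets(window: usize) -> Budgets {
    Budgets {
        total: window * 80 / 100,
        journal: window * 15 / 100,
    }
}
```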
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Check HTTP status from logprobs API (was silently ignoring 500s).
Call publish_context_state() after storing scores so F10 screen
updates. Add chunk size logging for OOM debugging.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Status bar shows "scoring 3/7..." during scoring. Debug pane logs
per-memory importance and top-5 response breakdowns. F10 context
screen shows which memories were important for each assistant
response as drilldown children (← memory_key (score)).
Added important_memories_for_entry() to look up the matrix by
conversation entry index.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
/score snapshots the context and client, releases the agent lock,
and runs scoring in the background. Only one score task runs at a
time (scoring_in_flight flag). Results are stored on Agent and shown
on the F10 context debug screen with importance scores per memory.
ApiClient derives Clone. ContextState derives Clone.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
score_memories() drops each memory from the context one at a time,
runs prompt_logprobs against the full conversation, and builds a
divergence matrix: memories × responses.
- Row sums = memory importance (for graph weight updates)
- Column sums = response memory-dependence (training candidates)
Uses vLLM's prompt_logprobs to check "would the model have said
this without this memory?" — one forward pass per memory, all
responses scored at once. ~3s per memory on B200.
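
The two reductions over the divergence matrix can be sketched as below.
This assumes the matrix is stored row-major as one Vec<f64> per memory
with one column per response, and that it is rectangular. The function
names are illustrative, not the ones score_memories() uses.

```rust
/// matrix[m][r]: divergence for response r when memory m is dropped.

/// One value per memory: total divergence it causes across responses.
pub fn row_sums(matrix: &[Vec<f64>]) -> Vec<f64> {
    matrix.iter().map(|row| row.iter().sum::<f64>()).collect()
}

/// One value per response: total divergence across all dropped memories.
/// Assumes every row has the same length as the first row.
pub fn column_sums(matrix: &[Vec<f64>]) -> Vec<f64> {
    let cols = matrix.first().map_or(0, |row| row.len());
    (0..cols)
        .map(|c| matrix.iter().map(|row| row[c]).sum::<f64>())
        .collect()
}
```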
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
User and assistant names now come from config.user_name and
config.assistant_name throughout: system prompt, DMN prompts,
debug screen, and all agent files. Agent templates use
{user_name} and {assistant_name} placeholders.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Show chunks received, SSE lines parsed, and the contents of
the line buffer (up to 500 bytes) on both stream errors and
timeouts. This tells us whether we got partial data, a non-SSE
response, or truly nothing from the server.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Stream chunk timeout is now api_stream_timeout_secs in config
(default 60s). Status bar shows total turn time and per-call
time with timeout: "thinking... 45s, 12/60s".
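
For reference, the status string could be built roughly like this. The
helper name and parameters are hypothetical; only the output format is
taken from the example above.

```rust
/// Formats "<label>... <turn>s, <call>/<timeout>s" for the status bar.
/// turn_secs is the total turn time, call_secs the current API call time.
pub fn format_activity(label: &str, turn_secs: u64, call_secs: u64, timeout_secs: u64) -> String {
    format!("{}... {}s, {}/{}s", label, turn_secs, call_secs, timeout_secs)
}
```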
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Spawned streaming tasks were never cancelled when a turn ended or
retried, leaving zombie tasks blocked on dead vLLM connections.
AbortOnDrop wrapper aborts the task when it goes out of scope.
Chunk timeout reduced from 120s to 60s.
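
The drop-guard idea, sketched with std only. This is a stand-in, not the
AbortOnDrop in this commit: instead of aborting an async task handle it
sets a shared flag that a worker would have to poll. CancelOnDrop and
its methods are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Guard that signals cancellation when it goes out of scope.
pub struct CancelOnDrop {
    cancelled: Arc<AtomicBool>,
}

impl CancelOnDrop {
    /// Returns the guard plus the flag the worker should poll.
    pub fn new() -> (Self, Arc<AtomicBool>) {
        let flag = Arc::new(AtomicBool::new(false));
        (Self { cancelled: Arc::clone(&flag) }, flag)
    }
}

impl Drop for CancelOnDrop {
    fn drop(&mut self) {
        // Runs on every exit path: normal return, early return, or panic unwind.
        self.cancelled.store(true, Ordering::SeqCst);
    }
}
```

Because cleanup lives in Drop, a turn that ends or retries cannot forget
to cancel its stream task.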
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
"thinking..." was getting stuck in the status bar when a turn
ended with a stream error, context overflow, or model error —
only the success path cleared it. Now all error returns clear
the activity indicator.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
pub → pub(crate) for SseReader methods (used across child modules).
pub → pub(super) for openai::stream_events, tool definitions, store
helpers. pub → private for normalize_link and differentiate_hub_with_graph
(only used within their own files).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Journal entries are written to the memory graph via journal_new/
journal_update, not appended to a flat file. Remove thought/journal.rs
(67 lines), strip_ephemeral_tool_calls (55 lines), default_journal_path,
and all wiring. -141 lines.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Journal entries are loaded from the memory graph store, not from the
flat journal file. Remove build_context_window, plan_context,
render_journal_text, assemble_context, truncate_at_section,
find_journal_cutoff, parse_journal*, ContextPlan, and stale TODOs.
Keep JournalEntry, default_journal_path (write path), and the live
context management functions. -363 lines.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
trim_conversation moved to thought/context.rs where model_context_window,
msg_token_count, is_context_overflow, is_stream_error already lived.
Delete the duplicate agent/context.rs (94 lines).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
JournalEntry, parse_journal, parse_journal_text, parse_header_timestamp,
and default_journal_path consolidated into thought/context.rs. Delete
the duplicate agent/journal.rs (235 lines). Update all references.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Serialize request JSON before send_and_check so it's available
for both HTTP errors and stream errors. Extracted save logic
into save_failed_request helper on SseReader.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Memory tool results (memory_render) are now pushed as
ConversationEntry::Memory with the node key, instead of plain
Messages. Remove loaded_nodes from ContextState — the debug
screen reads memory info from Memory entries in the conversation.
Surfaced memories from surface-observe are pushed as separate
Memory entries, reflections as separate system-reminder messages.
User input is no longer polluted with hook output.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Log ConversationEntry (with Memory/Message typing) instead of
raw Message. restore_from_log reads typed entries directly,
preserving Memory vs Message distinction across restarts.
Remove current.json snapshot and save_session — the append-only
log is the single source of truth. Remove dead read_all and
message_count methods. Add push_entry for logging typed entries.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Delete anthropic.rs (713 lines) — we only use OpenAI-compatible
endpoints (vLLM, OpenRouter). Simplify ApiClient to store base_url
directly instead of Backend enum.
SseReader now stores the serialized request payload and saves it
to ~/.consciousness/logs/failed-request-{ts}.json on stream timeout,
so failed requests can be replayed with curl for debugging.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
model_context_window() now reads from config.api_context_window
instead of guessing from model name strings. is_anthropic_model()
replaced with backend == "anthropic" checks. Dead model field
removed from AgentDef/AgentHeader.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
build_context_window loaded journal from a stale flat file and
assembled the full context. Now journal comes from the memory graph
and context is assembled on the fly. All that's needed is trimming
the conversation to fit the budget.
trim_conversation accounts for identity, journal, and reserve
tokens, then drops oldest conversation messages until it fits.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The restore and compaction paths called build_context_window which
reads from the stale flat journal file, overwriting the journal we
loaded from the memory graph. Preserve the graph-loaded journal
across these operations.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Replace untyped message list with ConversationEntry enum:
- Message(Message) — regular conversation turn
- Memory { key, message } — memory content with preserved message
  for KV cache round-tripping
Budget counts memory vs conversation by matching on enum variant.
Debug screen labels memory entries with [memory: key]. No heuristic
tool-name scanning.
Custom serde: Memory serializes with a memory_key field alongside
the message fields, deserializes by checking for the field.
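
A rough sketch of the enum and the variant-based budget split described
above. The Message fields, the whitespace token estimate, and the helper
names are assumptions for illustration. The custom serde impl is left
out.

```rust
#[derive(Debug, Clone)]
pub struct Message {
    pub role: String,
    pub content: String,
}

#[derive(Debug, Clone)]
pub enum ConversationEntry {
    /// Regular conversation turn.
    Message(Message),
    /// Memory content; the original message is kept alongside the key.
    Memory { key: String, message: Message },
}

/// Hypothetical token estimate: whitespace-separated words.
fn approx_tokens(msg: &Message) -> usize {
    msg.content.split_whitespace().count()
}

/// Returns (memory_tokens, conversation_tokens) by matching on the variant.
pub fn split_budget(entries: &[ConversationEntry]) -> (usize, usize) {
    let (mut memory, mut conversation) = (0, 0);
    for entry in entries {
        match entry {
            ConversationEntry::Message(m) => conversation += approx_tokens(m),
            ConversationEntry::Memory { message, .. } => memory += approx_tokens(message),
        }
    }
    (memory, conversation)
}

/// Label for the debug screen.
pub fn debug_label(entry: &ConversationEntry) -> String {
    match entry {
        ConversationEntry::Message(m) => format!("[{}]", m.role),
        ConversationEntry::Memory { key, .. } => format!("[memory: {}]", key),
    }
}
```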
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Remove cached context_budget field and measure_budget(). Budget
is computed on demand via budget() which calls
ContextState::budget(). Each bucket counted from its typed source.
Memory split from conversation by identifying memory tool calls.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
context.messages is conversation-only now — remove conv_start
scanning. Memory counted from loaded_nodes (same as debug screen).
No subtraction heuristics.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
refresh_context_message was injecting personality into conversation
messages (assuming fixed positions that no longer exist). Replaced
with refresh_context_state which just re-measures and publishes.
conv_tokens now subtracts mem_tokens since memory tool results are
in the conversation message list.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
ContextState now owns everything in the context window:
system_prompt, personality, journal, working_stack, loaded_nodes,
and conversation messages. No duplication — each piece exists once
in its typed form.
assemble_api_messages() renders the full message list on the fly
from typed sources. measure_budget() counts each bucket from its
source directly. push_context() removed — identity/journal are
never pushed as messages.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Identity tokens from system_prompt + personality vec. Journal
from journal entries vec. Memory from loaded_nodes. Conversation
is the remainder. No string prefix matching.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Count journal tokens directly from Vec<JournalEntry> instead of
scanning message text for prefix strings. Type system, not string
typing.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Keep journal entries as structured data in ContextState. Render
to text only when building the context message. Debug screen reads
the structured entries directly — no parsing ## headers back out.
Compaction paths temporarily parse the string from build_context_window
back to entries (to be cleaned up when compaction is reworked).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Render journal entries directly with ## headers instead of going
through the plan_context/render_journal_text pipeline. The journal
budget is 5% of the model context window (~6500 tokens for Qwen
128K). Simpler and more predictable.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Iterate journal entries backwards from the conversation cutoff,
accumulating within a ~10K token budget (~8% of context window).
Stops when the budget is full but keeps at least one entry. Much
more efficient than loading all entries and trimming.
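
The backwards walk can be sketched as follows. It assumes the input is
the per-entry token counts for entries before the cutoff, oldest first.
select_recent is a hypothetical name.

```rust
/// Returns the index of the oldest entry to keep; entries[start..] are loaded.
pub fn select_recent(token_counts: &[usize], budget: usize) -> usize {
    let mut used = 0;
    let mut start = token_counts.len();
    for (i, &tokens) in token_counts.iter().enumerate().rev() {
        let have_one = start != token_counts.len();
        // Stop once the next (older) entry would overflow the budget,
        // but always take the newest entry even if it is over budget alone.
        if have_one && used + tokens > budget {
            break;
        }
        used += tokens;
        start = i;
    }
    start
}
```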
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Replace flat-file journal parser with direct store query for
EpisodicSession nodes. Filter journal entries to only those older
than the oldest conversation message (plus one overlap entry to
avoid gaps). Falls back to 20 recent entries when no conversation
exists yet.
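
One possible shape of the filter, on bare timestamps sorted oldest
first. It assumes the "overlap entry" is the first entry at or after the
oldest conversation message. That reading, and the name journal_window,
are guesses for illustration only.

```rust
/// Selects the journal timestamps to load.
/// `timestamps` must be sorted ascending.
pub fn journal_window(timestamps: &[u64], oldest_message: Option<u64>) -> &[u64] {
    match oldest_message {
        // No conversation yet: fall back to the 20 most recent entries.
        None => &timestamps[timestamps.len().saturating_sub(20)..],
        Some(cutoff) => {
            // Count of entries strictly older than the oldest message.
            let older = timestamps.partition_point(|&t| t < cutoff);
            // Plus one overlap entry, if there is one, to avoid a gap.
            let end = (older + 1).min(timestamps.len());
            &timestamps[..end]
        }
    }
}
```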
Fixes: poc-agent context window showing 0 journal entries.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
spawn_agent returns Child handle + log_path. AgentCycleState stores
the Child, polls with try_wait() on each trigger to detect completion.
No more filesystem scanning to track agent lifecycle.
AgentSnapshot (Clone) sent to TUI for display. AgentInfo holds the
Child handle and stays in the state.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
spawn_agent() now returns SpawnResult { pid, log_path } so the
log path is known at spawn time. No more filesystem scanning.
AgentInfo carries log_path, TUI reads it directly.
F2 → Enter shows the actual agent log (stdout/stderr from the
poc-memory agent process), not the hook orchestration log.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Runner owns AgentCycleState, calls trigger() on each user message
instead of the old run_hook() JSON round-trip. Sends AgentUpdate
messages to TUI after each cycle.
TUI F2 screen reads agent state from messages instead of scanning
the filesystem on every frame. HookSession::from_fields() lets
poc-agent construct sessions without JSON serialization.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Remove POC_AGENT early return (was from old claude -p era)
- Split hook into run_agent_cycles() -> AgentCycleOutput (returns
  memory keys + reflection) and format_agent_output() (renders for
  Claude Code injection). poc-agent can call run_agent_cycles
  directly and handle output its own way.
- Fix UTF-8 panic in runner.rs display_buf slicing (floor_char_boundary)
- Add priority debug label to API requests
- Wire up F2 agents screen: live pid status, output files, hook log
  tail, arrow key navigation, Enter for log detail view
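
For context on the display_buf fix: slicing a &str at a byte offset that
falls inside a multi-byte character panics. Below is a hand-rolled
equivalent of the floor_char_boundary idea using the stable
str::is_char_boundary. The helper names are hypothetical and this is not
the code in runner.rs.

```rust
/// Largest char boundary that is <= index (clamped to s.len()).
pub fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Offset 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Truncates to at most max_bytes without splitting a character.
pub fn truncate_display(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```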
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thread request priority through the API call chain to vLLM's
priority scheduler. Lower value = higher priority, with preemption.
Priority is set per-agent in the .agent header:
- interactive (runner): 0 (default, highest)
- surface-observe: 1 (near-realtime, watches conversation)
- all other agents: 10 (batch, default if not specified)
Requires vLLM started with --scheduling-policy priority.
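
A toy illustration of the defaults and ordering above. It only shows
"lower value runs first" with 10 as the fallback. It is not vLLM's
scheduler and does not model preemption. The function names and the
"consolidate" agent in the usage note are made up.

```rust
/// Default for agents whose .agent header does not set a priority.
pub const DEFAULT_AGENT_PRIORITY: i32 = 10;

/// Priority from the header, or the batch default when absent.
pub fn effective_priority(header_priority: Option<i32>) -> i32 {
    header_priority.unwrap_or(DEFAULT_AGENT_PRIORITY)
}

/// Orders request names so that lower priority values come first.
/// The sort is stable, so equal priorities keep their arrival order.
pub fn schedule_order<'a>(requests: &[(&'a str, Option<i32>)]) -> Vec<&'a str> {
    let mut sorted: Vec<(&'a str, i32)> = requests
        .iter()
        .map(|&(name, prio)| (name, effective_priority(prio)))
        .collect();
    sorted.sort_by_key(|&(_, prio)| prio);
    sorted.into_iter().map(|(name, _)| name).collect()
}
```

With requests from "consolidate" (no priority set), surface-observe (1)
and the runner (0) queued together, the runner goes first and the
unspecified agent last.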
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split the streaming pipeline: API backends yield StreamEvents through
a channel, the runner reads them and routes to the appropriate UI pane.
- Add StreamEvent enum (Content, Reasoning, ToolCallDelta, etc.)
- API start_stream() spawns backend as a task, returns event receiver
- Runner loops over events, sends content to conversation pane but
  suppresses <tool_call> XML with a buffered tail for partial tags
- OpenAI backend refactored to stream_events() — no more UI coupling
- Anthropic backend gets a wrapper that synthesizes events from the
  existing stream() (TODO: native event streaming)
- chat_completion_stream() kept for subconscious agents, reimplemented
  on top of the event stream
- Usage derives Clone
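
The "buffered tail" idea from the runner item above, as a simplified
sketch. It holds back any trailing text that could be the start of
<tool_call> until the next chunk settles it. Simplification: everything
after the opening tag is suppressed and the closing tag is not handled.
ToolCallFilter is a hypothetical name, not the runner's type.

```rust
const OPEN_TAG: &str = "<tool_call>";

/// Filters streamed content so tool-call XML never reaches the UI.
pub struct ToolCallFilter {
    pending: String,
    in_tool_call: bool,
}

impl ToolCallFilter {
    pub fn new() -> Self {
        Self { pending: String::new(), in_tool_call: false }
    }

    /// Feed one content chunk; returns the text that is safe to display.
    pub fn push(&mut self, chunk: &str) -> String {
        if self.in_tool_call {
            return String::new();
        }
        self.pending.push_str(chunk);
        if let Some(pos) = self.pending.find(OPEN_TAG) {
            // Full tag seen: show what came before it, drop the rest.
            self.in_tool_call = true;
            let visible = self.pending[..pos].to_string();
            self.pending.clear();
            return visible;
        }
        // Hold back the longest suffix that is a proper prefix of the tag.
        let hold = (1..OPEN_TAG.len())
            .rev()
            .find(|&n| self.pending.ends_with(&OPEN_TAG[..n]))
            .unwrap_or(0);
        let split = self.pending.len() - hold;
        let tail = self.pending.split_off(split);
        std::mem::replace(&mut self.pending, tail)
    }
}
```

The held-back suffix is always ASCII, so the split never lands inside a
multi-byte character.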
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Tool call parsing was only in runner.rs, so subconscious agents
(poc-memory agent run) never recovered leaked tool calls from
models that emit <tool_call> as content text (e.g. Qwen via Crane).
Move the recovery into build_response_message where both code paths
share it. Leaked tool calls are promoted to structured tool_calls
and the content is cleaned, so all consumers see them uniformly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Log output was scattered across ~/.consciousness/memory/ (daemon,
task logs, LLM call logs) and ~/.consciousness/agent-sessions/
(observe); only hook logs were already in the right place.
Move everything to ~/.consciousness/logs/ with agent-specific subdirs:
- daemon.log, daemon/ (task logs)
- {agent_name}/ (knowledge agent logs, e.g. surface-observe/, reflect/)
- llm/{caller}/ (LLM call logs)
- observe.log (poc-agent observe)
- hook-{session_id} (already correct)
- debug.log (already correct)
Also includes the session.rs and hook.rs fixes from the previous
session (sessions dir → ~/.consciousness/sessions/).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>