No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Eliminates the circular dependency between poc-agent and poc-memory by
moving all poc-agent source into poc-memory/src/agent/. The poc-agent
binary now builds from poc-memory/src/bin/poc-agent.rs using library
imports. All poc_agent:: references updated to crate::agent::.
poc-agent/ directory kept for now (removed from workspace members).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Make Session::from_env() and Session::seen() the public API for
accessing session state. Internal callers converted to use session
methods. Search automatically filters already-surfaced nodes when
POC_SESSION_ID is set.
Extract response after '=== RESPONSE ===' marker before parsing
for REFLECTION/NEW RELEVANT MEMORIES. The agent runner dumps the
full log (turns, think blocks) to stdout.
`memory-search surface` and `memory-search reflect` run the agent
directly, parse the output, and dump rendered results to stdout.
Useful for testing with `watch memory-search reflect`.
Thread temperature parameter from agent def header through the API
call chain. Agents can now specify {"temperature": 1.2} in their
JSON header to override the default 0.6.
Also includes Kent's reflect agent prompt iterations.
Add reflect.agent — a lateral-thinking subconscious agent that
observes the conversation and offers occasional reflections when the
conscious mind seems to be missing something.
Refactor memory_search.rs: extract generic agent_cycle_raw() from
the surface-specific code. PID tracking, timeout, spawn/reap logic
is now shared. Surface and reflect agents each have their own result
handler (handle_surface_result, handle_reflect_result) wired through
the common lifecycle.
Add `agent: bool` field to ContextGroup (default true) so agents get
personality/identity context without session-specific groups (journal,
where-am-i). Agents now get the full identity.md, reflections.md,
toolkit, etc. instead of the compact core-personality loader.
New {{agent-context}} placeholder resolves all agent-tagged groups
using the same get_group_content() as load-context.
The CLI render command was marking keys as seen in the user's session
whenever POC_SESSION_ID was set. Agent processes inherit POC_SESSION_ID
(they need to read the conversation and seen set), so their tool calls
to poc-memory render were writing to the seen file as a side effect —
bypassing the dedup logic in surface_agent_cycle.
Fix: set POC_AGENT=1 at the start of cmd_run_agent (covers all agents,
not just surface), and guard the CLI render seen-marking on POC_AGENT
being absent. Agents can read the seen set but only surface_agent_cycle
should write to it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Standalone binary that exercises TailMessages on a transcript file,
reporting progress and timing. Useful for isolating conversation
resolution issues from the full hook pipeline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the hook logic from the memory-search binary into a library module
(poc_memory::memory_search) so poc-hook can call it as a direct function
instead of spawning a subprocess with piped stdin.
Also convert the node render call in surface_agent_cycle from
Command::new("poc-memory render") to a direct crate::cli::node::render_node()
call, eliminating another subprocess.
The memory-search binary remains as a thin CLI wrapper for debugging
(--hook reads from stdin) and inspection (show_seen).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add surface_hooks config field — list of hook event names that trigger
the surface agent (e.g. ["UserPromptSubmit"]). Empty list disables it.
Reduce surface agent search from 3-5 hops to 2-3 to keep prompt size
under the API endpoint's connection limit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The placeholder resolver re-scanned from the beginning of the string
after each expansion. If expanded node content contained {{...}}
patterns (which core-personality does), those got expanded recursively.
Cyclic node references caused infinite string growth.
Fix: track a position offset that advances past each substitution,
so expanded content is never re-scanned for placeholders.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The backward JSON scanner (JsonlBackwardIter and TailMessages) was
matching } characters inside JSON strings — code blocks full of Rust
braces being the primary offender. This caused:
- Quadratic retry behavior on code-heavy transcripts (wrong object
boundaries → serde parse failure → retry from different position)
- Inconsistent find_last_compaction_in_file offsets across calls,
making detect_new_compaction fire repeatedly → context reload on
every hook call → seen set growing without bound
Fix: add string-boundary tracking with escaped-quote handling to
the close-brace finder loop, matching the existing logic in the
depth-tracking loop.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove MEMORY_FILES constant from identity.rs
- Add ContextGroup struct for deserializing from config
- Load context_groups from ~/.config/poc-agent/config.json5
- Check ~/.config/poc-agent/ first for identity files, then project/global
- Debug screen now shows what's actually configured
This eliminates the hardcoded duplication and makes the debug output
match what's in the config file.
The surface agent result consumer in poc-hook was writing to the seen
file but not the returned file, so surfaced keys showed up as
"context-loaded" in memory-search --seen.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With limit:10, all seeds' neighborhoods got concatenated into one
massive prompt (878KB+), exceeding the model's context. One seed
at a time keeps prompts well under budget.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of merging both into one flat list, display them as distinct
sections so it's clear what was surfaced in this context vs what
came from before compaction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two separate placeholders give the agent structural clarity about
which memories are already in context vs which were surfaced before
compaction and may need re-surfacing. Also adds memory_ratio
placeholder so the agent can self-regulate based on how much of
context is already recalled memories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Budget of 20 roots split between current and prev. Current gets
priority, prev fills the remainder. Prevents flooding the agent
with hundreds of previously surfaced keys.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Present the two seen sets separately to the surface agent:
- Current: already in context, don't re-surface
- Pre-compaction: context was reset, re-surface if still relevant
This lets the agent re-inject important memories after compaction
instead of treating everything ever surfaced as "already shown."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract surface_agent_cycle() and call from both hooks. Enables
memory surfacing during autonomous work (tool calls without human
prompts). Rate limiting via PID file prevents overlap.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Track which nodes have already been included and skip duplicates.
High-degree seed nodes with overlapping neighborhoods were pulling
the same big nodes dozens of times, inflating prompts to 878KB.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With core-personality + instructions + subconscious-notes adding
~200KB on top of the neighborhood, the 600KB budget pushed total
prompts over the 800KB guard. Lowered to 400KB so full prompts
stay under the limit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mark_seen now takes the in-memory HashSet and checks before appending.
Prevents the same key being written 30+ times from repeated search hits
and context reloads.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move core-personality and conversation to the end of the prompt.
The model needs to see its task before 200KB of conversation
context. Also: limit to 3 hops, 2-3 memories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent output now includes logging (think blocks, tool calls)
before the final response. Search the tail instead of checking
only the last line.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The type field is near the start of JSONL objects. Scanning the
full object (potentially megabytes for tool_results) was the
bottleneck — TwoWaySearcher dominated the profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use memchr::memmem to check for "type":"user" or "type":"assistant"
in raw bytes before parsing. Avoids deserializing large tool_result
and system objects entirely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TailMessages is a proper iterator that yields (role, text, timestamp)
newest-first. Owns the mmap internally. Caller decides when to stop.
resolve_conversation collects up to 200KB, then reverses to
chronological order. No compaction check needed — the byte budget
naturally limits how far back we scan.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces byte-by-byte backward iteration with memrchr3('{', '}', '"')
which uses SIMD to jump between structurally significant bytes. Major
speedup on large transcripts (1.4GB+).
Also simplifies tail_messages to use a byte budget (200KB) instead
of token counting.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Was parsing every object twice (compaction check + message extract)
and running contains_bytes on every object for the compaction marker.
Now: quick byte pre-filter for "user"/"assistant", parse once, check
compaction after text extraction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverse-scans the mmap'd transcript using JsonlBackwardIter,
collecting user/assistant messages up to a token budget, stopping
at the compaction boundary. Returns messages in chronological order.
resolve_conversation() now uses this instead of parsing the entire
file through extract_conversation + split_on_compaction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When poc-memory render is called inside a Claude session, add the
key to the seen set so the surface agent knows it's been shown.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fires on each UserPromptSubmit, reads the conversation via
{{conversation}}, checks {{seen_recent}} to avoid re-surfacing,
searches the memory graph, and outputs a key list or nothing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
{{conversation}} reads POC_SESSION_ID, finds the transcript, extracts
the last segment (post-compaction), returns the tail ~100K chars.
{{seen_recent}} merges current + prev seen files for the session,
returns the 20 most recently surfaced memory keys with timestamps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Surface agent fires asynchronously on UserPromptSubmit, deposits
results for the next prompt to consume. This commit adds:
- poc-hook: spawn surface agent with PID tracking and configurable
timeout, consume results (NEW RELEVANT MEMORIES / NO NEW), render
and inject surfaced memories, observation trigger on conversation
volume
- memory-search: rotate seen set on compaction (current → prev)
instead of deleting, merge both for navigation roots
- config: surface_timeout_secs option
The .agent file and agent output routing are still pending.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All LLM calls now go through the direct API backend. Removes
call_model, call_model_with_tools, call_sonnet, call_haiku,
log_usage, and their dependencies (Command, prctl, watchdog).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
audit, digest, and compare now go through the API backend via
call_simple(), which logs to llm-logs/{caller}/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pass the caller's log closure all the way through to api.rs instead
of creating a separate eprintln closure in llm.rs. Everything goes
through one stream — prompt, think blocks, tool calls with args,
tool results with content, token counts, final response.
CLI uses println (stdout), daemon uses its task log. No more split
between stdout and stderr.
Also removes the llm-log file creation from knowledge.rs — that's
the daemon's concern, not the agent runner's.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The original '>' detection was too broad and caught tool output lines.
Now we look for '> X: ' pattern (user prompt with speaker prefix) to
detect the start of a new user input, which marks the end of the
previous response.
The --block flag makes poc-agent read block until a complete response
is received (detected by a new user input line starting with '>'),
then exit. This enables smoother three-way conversations where one
instance can wait for the other's complete response without polling.
The implementation:
- Added cmd_read_inner() with block parameter
- Modified socket streaming to detect '>' lines as response boundaries
- Added --block CLI flag to Read subcommand
The --follow flag continues to stream indefinitely.
The --block flag reads one complete response and exits.
Neither flag exits immediately if there's no new output.
The hook now tracks transcript size and queues an observation agent
run every ~5K tokens (~20KB) of new conversation. This makes memory
formation reactive to conversation volume rather than purely daily.
Configurable via POC_OBSERVATION_THRESHOLD env var. The observation
agent's chunk_size (in .agent file) controls how much context it
actually processes per run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move data sections before instructions (core at top, subconscious +
notes at bottom near task). Deduplicate guidelines that are now in
memory-instructions-core-subconscious. Compress verbose paragraphs
to bullet points.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>