TailMessages is a proper iterator that yields (role, text, timestamp)
newest-first. Owns the mmap internally. Caller decides when to stop.
resolve_conversation collects up to 200KB, then reverses to
chronological order. No compaction check needed — the byte budget
naturally limits how far back we scan.
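The collect-then-reverse shape can be sketched like this. `Msg`, `resolve_conversation`, and the exact budget check are assumptions; the real `TailMessages` wraps the mmap internally, but any newest-first iterator shows the idea:

```rust
// Hypothetical sketch: collect newest-first messages until a byte budget
// is exhausted, then reverse into chronological order.
const BYTE_BUDGET: usize = 200 * 1024;

#[derive(Debug, PartialEq)]
pub struct Msg {
    pub role: String,
    pub text: String,
    pub timestamp: u64,
}

/// `tail` yields messages newest-first (as TailMessages would).
pub fn resolve_conversation<I: Iterator<Item = Msg>>(tail: I) -> Vec<Msg> {
    let mut used = 0usize;
    let mut out = Vec::new();
    for m in tail {
        used += m.text.len();
        out.push(m);
        if used >= BYTE_BUDGET {
            break; // the budget, not a compaction check, bounds the scan
        }
    }
    out.reverse(); // newest-first -> chronological
    out
}
```

No state about compaction boundaries is threaded through; stopping is purely a byte count.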
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces byte-by-byte backward iteration with memrchr3('{', '}', '"')
which uses SIMD to jump between structurally significant bytes. Major
speedup on large transcripts (1.4GB+).
Also simplifies tail_messages to use a byte budget (200KB) instead
of token counting.
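The backward jump can be sketched with a std-only stand-in for the memchr crate's SIMD `memrchr3`; the object-boundary logic around it is a simplified assumption (it ignores escaped quotes and trusts balanced quoting per line):

```rust
// Std-only stand-in for memchr::memrchr3 (same contract, no SIMD).
fn memrchr3(b1: u8, b2: u8, b3: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().rposition(|&b| b == b1 || b == b2 || b == b3)
}

/// Find the start of the last `{`..`}` object before `end`, jumping only
/// between structurally significant bytes instead of scanning byte-by-byte.
fn last_object_start(buf: &[u8], mut end: usize) -> Option<usize> {
    let mut depth = 0i32;
    let mut in_string = false; // naive: assumes unescaped, balanced quotes
    while let Some(i) = memrchr3(b'{', b'}', b'"', &buf[..end]) {
        match buf[i] {
            b'"' => in_string = !in_string,
            b'}' if !in_string => depth += 1,
            b'{' if !in_string => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
        end = i;
    }
    None
}
```

With the real `memchr::memrchr3`, each call skips runs of uninteresting bytes in bulk, which is where the speedup on multi-gigabyte transcripts comes from.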
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Was parsing every object twice (compaction check + message extract)
and running contains_bytes on every object for the compaction marker.
Now: quick byte pre-filter for "user"/"assistant", parse once, check
compaction after text extraction.
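The pre-filter is cheap substring search over raw bytes; only lines that can possibly be user/assistant messages pay the parse cost. A minimal sketch (function names assumed):

```rust
// Cheap byte-level pre-filter: run before any JSON parsing.
fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
    haystack.windows(needle.len()).any(|w| w == needle)
}

/// True if the line might be a user or assistant message; false lines
/// are skipped without parsing.
fn might_be_message(line: &[u8]) -> bool {
    contains_bytes(line, b"\"user\"") || contains_bytes(line, b"\"assistant\"")
}
```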
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverse-scans the mmap'd transcript using JsonlBackwardIter,
collecting user/assistant messages up to a token budget, stopping
at the compaction boundary. Returns messages in chronological order.
resolve_conversation() now uses this instead of parsing the entire
file through extract_conversation + split_on_compaction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When poc-memory render is called inside a Claude session, add the
key to the seen set so the surface agent knows it's been shown.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fires on each UserPromptSubmit, reads the conversation via
{{conversation}}, checks {{seen_recent}} to avoid re-surfacing,
searches the memory graph, and outputs a key list or nothing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
{{conversation}} reads POC_SESSION_ID, finds the transcript, extracts
the last segment (post-compaction), returns the tail ~100K chars.
{{seen_recent}} merges current + prev seen files for the session,
returns the 20 most recently surfaced memory keys with timestamps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Surface agent fires asynchronously on UserPromptSubmit, deposits
results for the next prompt to consume. This commit adds:
- poc-hook: spawn surface agent with PID tracking and configurable
timeout, consume results (NEW RELEVANT MEMORIES / NO NEW), render
and inject surfaced memories, observation trigger on conversation
volume
- memory-search: rotate seen set on compaction (current → prev)
instead of deleting, merge both for navigation roots
- config: surface_timeout_secs option
The .agent file and agent output routing are still pending.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All LLM calls now go through the direct API backend. Removes
call_model, call_model_with_tools, call_sonnet, call_haiku,
log_usage, and their dependencies (Command, prctl, watchdog).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
audit, digest, and compare now go through the API backend via
call_simple(), which logs to llm-logs/{caller}/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pass the caller's log closure all the way through to api.rs instead
of creating a separate eprintln closure in llm.rs. Everything goes
through one stream — prompt, think blocks, tool calls with args,
tool results with content, token counts, final response.
CLI uses println (stdout), daemon uses its task log. No more split
between stdout and stderr.
Also removes the llm-log file creation from knowledge.rs — that's
the daemon's concern, not the agent runner's.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The hook now tracks transcript size and queues an observation agent
run every ~5K tokens (~20KB) of new conversation. This makes memory
formation reactive to conversation volume rather than purely daily.
Configurable via POC_OBSERVATION_THRESHOLD env var. The observation
agent's chunk_size (in .agent file) controls how much context it
actually processes per run.
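The trigger condition is a simple delta check against the last transcript size we acted on. A sketch, with the default threshold (20KB) and the split between env override and pure check as assumptions:

```rust
// Hypothetical sketch of the observation trigger.
// POC_OBSERVATION_THRESHOLD (bytes) overrides the ~20KB default.
fn observation_threshold() -> u64 {
    std::env::var("POC_OBSERVATION_THRESHOLD")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(20 * 1024)
}

/// `last_seen` is the transcript size recorded when we last queued a run.
fn should_queue_observation(transcript_len: u64, last_seen: u64, threshold: u64) -> bool {
    transcript_len.saturating_sub(last_seen) >= threshold
}
```

The hook would call `should_queue_observation(len, last_seen, observation_threshold())` on each UserPromptSubmit and update `last_seen` when it queues.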
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move data sections before instructions (core at top, subconscious +
notes at bottom near task). Deduplicate guidelines that are now in
memory-instructions-core-subconscious. Compress verbose paragraphs
to bullet points.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All 18 agents now include:
- {{node:memory-instructions-core}} — tool usage instructions
- {{node:memory-instructions-core-subconscious}} — subconscious framing
- {{node:subconscious-notes-{agent_name}}} — per-agent persistent notes
The subconscious instructions are additive, not a replacement for
the core memory instructions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When building the {{neighborhood}} placeholder for distill and other
agents, stop adding full neighbor content once the prompt exceeds
600KB (~150K tokens). Remaining neighbors get header-only treatment
(key + link strength + first line).
This fixes distill consistently failing on high-degree nodes like
inner-life-sexuality-intimacy whose full neighborhood was 2.5MB.
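The budget logic amounts to: keep appending full neighbor sections until the accumulated output crosses the limit, then switch to header-only entries. A sketch with hypothetical types and formatting:

```rust
// Hypothetical sketch of the neighborhood budget.
const NEIGHBORHOOD_BUDGET: usize = 600 * 1024;

struct Neighbor {
    key: String,
    strength: f32,
    content: String,
}

fn render_neighborhood(neighbors: &[Neighbor]) -> String {
    let mut out = String::new();
    for n in neighbors {
        if out.len() < NEIGHBORHOOD_BUDGET {
            // Under budget: full content.
            out.push_str(&format!("## {} ({:.2})\n{}\n", n.key, n.strength, n.content));
        } else {
            // Over budget: key + link strength + first line only.
            let first = n.content.lines().next().unwrap_or("");
            out.push_str(&format!("## {} ({:.2}): {}\n", n.key, n.strength, first));
        }
    }
    out
}
```

Since neighbors are ranked before rendering, the full-content slots go to the highest-scoring neighbors and the tail degrades gracefully instead of blowing past the model's context.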
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each consolidation agent now has its own persistent notes node
(subconscious-notes-{agent_name}) loaded via template substitution.
Agents can read their notes at the start of each run and write
updates after completing work, accumulating operational wisdom.
New node: memory-instructions-core-subconscious — shared framing
for background agents ("you are an agent of PoC's subconscious").
Template change: {agent_name} is substituted before {{...}} placeholder
resolution, enabling per-agent node references in .agent files.
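The ordering matters because the node key inside the placeholder is itself templated. A toy two-pass resolver (the `{{node:KEY}}` handling here is a deliberately minimal assumption about the real resolver):

```rust
// Toy sketch: {agent_name} first, then {{node:KEY}} lookups.
fn substitute(template: &str, agent: &str, lookup: impl Fn(&str) -> String) -> String {
    // Pass 1: single-brace agent name, so node keys become concrete.
    let t = template.replace("{agent_name}", agent);
    // Pass 2: resolve {{node:KEY}} placeholders.
    let mut out = String::new();
    let mut rest = t.as_str();
    while let Some(start) = rest.find("{{node:") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 7..];
        if let Some(end) = after.find("}}") {
            out.push_str(&lookup(&after[..end]));
            rest = &after[end + 2..];
        } else {
            out.push_str(&rest[start..]); // unterminated: leave as-is
            rest = "";
        }
    }
    out.push_str(rest);
    out
}
```

Run the passes in the other order and the lookup would receive the literal key `subconscious-notes-{agent_name}`, which resolves to nothing.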
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously 'poc-memory agent run <agent> --count N' always ran locally,
loading the full store and executing synchronously. This was slow and
bypassed the daemon's concurrency control and persistent task queue.
Now the CLI checks for a running daemon first and queues via RPC
(returning instantly) unless --local, --debug, or --dry-run is set.
Falls back to local execution if the daemon isn't running.
This also avoids the expensive Store::load() on the fast path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Observation agent was getting 261KB prompts (5 × 50KB chunks) —
too much for focused mining. Now agents can set count, chunk_size,
and chunk_overlap in their JSON header. observation.agent set to
count:1 for smaller, more focused prompts.
Also moved task instructions after {{CONVERSATIONS}} so they're
at the end of the prompt where the model attends more strongly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- store/types.rs: sanitize timestamps on capnp load — old records had
raw offsets instead of unix epoch, breaking sort-by-timestamp queries
- agents/api.rs: drain reasoning tokens from UI channel into LLM logs
so we can see Qwen's chain-of-thought in agent output
- agents/daemon.rs: persistent task queue (pending-tasks.jsonl) —
tasks survive daemon restarts. Push before spawn, remove on completion,
recover on startup.
- api/openai.rs: only send reasoning field when explicitly configured,
not on every request (fixes vllm warning)
- api/mod.rs: add 600s total request timeout as backstop for hung
connections
- Cargo.toml: enable tokio-console feature for task introspection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- observation.agent: rewritten to navigate graph and prefer refining
existing nodes over creating new ones. Identity-framed prompt,
goals over rules.
- poc-memory edit: opens node in $EDITOR, writes back on save,
no-op if unchanged
- daemon: remove extra_workers (jobkit tokio migration dropped it),
remove sequential chaining of same-type agents (in-flight exclusion
is sufficient)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The lower threshold excluded too many neighbors, causing "query
returned no results (after exclusion)" failures and underloading
the GPU. Now only strongly-connected neighbors (score > 0.3) are
excluded, balancing collision prevention with GPU utilization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a prompt exceeds the size guard, dump it to a timestamped file
with agent name, size, and seed node keys. Makes it easy to find
which nodes are blowing up prompts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes:
1. Sanitize tool call arguments before pushing to conversation
history — vllm re-parses them as JSON on the next request and
crashes on invalid JSON from a previous turn. Malformed args now
get replaced with {} and the model gets an error message telling
it to retry with valid JSON.
2. Remove is_split special case — split goes through the normal
job_consolidation_agent path like all other agents.
3. call_for_def always uses API when api_base_url is configured,
regardless of tools field. Remove tools field from all .agent
files — memory tools are always provided by the API layer.
Also adds prompt size guard (800KB max) to catch oversized prompts
before they hit the model context limit.
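The guard itself is a one-line check before the API call; a sketch (error type and message are assumptions):

```rust
// Hypothetical sketch of the prompt size guard.
// 800KB is roughly 200K tokens at ~4 bytes/token.
const MAX_PROMPT_BYTES: usize = 800 * 1024;

fn check_prompt_size(prompt: &str) -> Result<(), String> {
    if prompt.len() > MAX_PROMPT_BYTES {
        Err(format!(
            "prompt is {} bytes, over the {} byte limit",
            prompt.len(),
            MAX_PROMPT_BYTES
        ))
    } else {
        Ok(())
    }
}
```

Failing here gives a clear local error instead of an opaque context-length rejection from the model.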
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove is_split special case in daemon — split now goes through
job_consolidation_agent like all other agents
- call_for_def uses API whenever api_base_url is configured, regardless
of tools field (was requiring non-empty tools to use API)
- Remove "tools" field from all .agent files — memory tools are always
provided by the API layer, not configured per-agent
- Add prompt size guard: reject prompts over 800KB (~200K tokens) with
clear error instead of hitting the model's context limit
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Config now derives serde::Deserialize with #[serde(default)] for all
fields. Path fields use custom deserialize_path/deserialize_path_opt
for ~ expansion. ContextGroup and ContextSource also derive Deserialize.
try_load_shared() is now 20 lines instead of 100: json5 → serde →
Config directly, then resolve API settings from the model/backend
cross-reference.
Removes MemoryConfigRaw intermediate struct entirely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Active agent types for consolidation cycles are now read from
config.json5 memory.agent_types instead of being hardcoded in
scoring.rs. Adding or removing agents is a config change, not
a code change.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-field ConsolidationPlan struct with HashMap<String, usize>
counts map. Agent types are no longer hardcoded in the struct — add
agents by adding entries to the map.
Active agents: linker, organize, distill, separator, split.
Removed: transfer (redundant with distill), connector (rethink later),
replay (not needed for current graph work).
Elo-based budget allocation now iterates the map instead of indexing
a fixed array. Status display and TUI adapted to show dynamic agent
lists.
memory-instructions-core v13: added protected nodes section — agents
must not rewrite core-personality, core-personality-detail, or
memory-instructions-core. They may add links but not modify content.
High-value neighbors should be treated with care.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactor cmd_render into render_node() that returns a String —
reusable by both the CLI and agent placeholders.
Add {{seed}} placeholder: renders each seed node using the same
output as poc-memory render (content + deduped footer links). Agents
see exactly what a human sees — no special formatting.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Warn when content contains render artifacts (poc-memory render key
embedded in prose — should be just `key`) or malformed → references.
Soft warnings on stderr, doesn't block the write.
Catches agent output that accidentally includes render-decorated
links, preventing content growth from round-trip artifacts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Render now detects neighbor keys that already appear in the node's
content and omits them from the footer link list. Inline references
serve as the node's own navigation structure; the footer catches
only neighbors not mentioned in prose.
Also fixes PEG query parser to accept hyphens in field names
(content-len was rejected).
memory-instructions-core updated to v12: documents canonical inline
link format (→ `key`), adds note about normalizing references when
updating nodes, and guidance on splitting oversized nodes.
Content is never modified for display — render is round-trippable.
Agents can read rendered output and write it back without artifacts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
linker: sort:isolation*0.7+recency(linker)*0.3
Prioritizes nodes in isolated communities that haven't been linked
recently. Bridges poorly-connected clusters into the main graph.
organize: sort:degree*0.5+isolation*0.3+recency(organize)*0.2
Prioritizes high-degree hubs in isolated clusters that haven't been
organized recently. Structural work where it matters most.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add sort:field*weight+field*weight+... syntax for weighted multi-field
sorting. Each field computes a 0-1 score, multiplied by weight, summed.
Available score fields:
isolation — community isolation ratio (1.0 = fully isolated)
degree — graph degree (normalized to max)
weight — node weight
content-len — content size (normalized to max)
priority — consolidation priority score
recency(X) — time since agent X last visited (sigmoid decay)
Example: sort:isolation*0.7+recency(linker)*0.3
Linker agents prioritize isolated communities that haven't been
visited recently.
Scores are pre-computed per sort (CompositeCache) to avoid redundant
graph traversals inside the sort comparator.
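The scoring model above can be sketched as a precomputed map from node to composite score, which is the CompositeCache idea: score once per node, then sort with cheap map lookups. The `score` callback stands in for the real per-field scorers (isolation, degree, recency, and so on):

```rust
use std::collections::HashMap;

// Hypothetical sketch: each (field, weight) pair contributes
// weight * normalized_score, summed per node.
fn composite_scores(
    nodes: &[&str],
    fields: &[(&str, f64)],              // (field name, weight)
    score: impl Fn(&str, &str) -> f64,   // 0..1 score for (node, field)
) -> HashMap<String, f64> {
    nodes
        .iter()
        .map(|&n| {
            let total: f64 = fields.iter().map(|&(f, w)| w * score(n, f)).sum();
            (n.to_string(), total)
        })
        .collect()
}
```

The sort comparator then only compares cached `f64`s, so no graph traversal happens inside the comparison.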
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add community_isolation() to Graph — computes per-community ratio of
internal vs total edge weight. 1.0 = fully isolated, 0.0 = all edges
external.
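The ratio for a single community can be sketched directly from that definition (edge representation is an assumption; the real Graph type is not shown):

```rust
use std::collections::HashSet;

/// Hypothetical sketch: isolation of one community.
/// `edges` are (from, to, weight); `members` is the community's node set.
fn isolation(members: &HashSet<&str>, edges: &[(&str, &str, f64)]) -> f64 {
    let mut internal = 0.0;
    let mut total = 0.0;
    for (a, b, w) in edges {
        let ia = members.contains(a);
        let ib = members.contains(b);
        if ia || ib {
            total += w; // edge touches the community
            if ia && ib {
                internal += w; // edge stays inside it
            }
        }
    }
    if total == 0.0 { 1.0 } else { internal / total }
}
```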
New query: sort:isolation — sorts nodes by their community's isolation
score, most isolated first. Useful for aiming organize agents at
poorly-integrated knowledge clusters.
New CLI: poc-memory graph communities [N] [--min-size M] — lists
communities sorted by isolation with member preview. Reveals islands
like the Shannon theory cluster (3 nodes, 100% isolated, 0 cross-edges)
and large agent-journal clusters (20-30 nodes, 95% isolated).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Track which nodes are being processed across all concurrent agents.
When an agent claims seeds, it adds them and their strongly-connected
neighbors (score = link_strength * node_weight > 0.15) to a shared
HashSet. Concurrent agents filter these out when running their query,
ensuring they work on distant parts of the graph.
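The claim step can be sketched as follows; the neighbor callback and set ownership are assumptions about the daemon's internals, but the threshold math matches the description above:

```rust
use std::collections::HashSet;

const CLAIM_THRESHOLD: f64 = 0.15;

/// Hypothetical sketch: claim seeds plus their strongly-connected neighbors
/// into the shared in-flight set. `neighbors` yields
/// (key, link_strength, node_weight) for a node.
fn claim<'a>(
    in_flight: &mut HashSet<&'a str>,
    seeds: &[&'a str],
    neighbors: impl Fn(&str) -> Vec<(&'a str, f64, f64)>,
) {
    for &s in seeds {
        in_flight.insert(s);
        for (key, strength, weight) in neighbors(s) {
            if strength * weight > CLAIM_THRESHOLD {
                in_flight.insert(key);
            }
        }
    }
}
```

Concurrent agents subtract `in_flight` from their query results before picking seeds, so they naturally land on disjoint regions.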
This replaces the eager-visit approach with a proper scheduling
mechanism: the daemon serializes seed selection while parallelizing
LLM work. The in-flight set is released on completion (or error).
Previously: core-personality rewritten 12x, irc-regulars 10x, same
node superseded 12x — concurrent agents all selected the same
high-degree hub nodes. Now they'll spread across the graph.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move visit recording from after LLM completion to immediately after
seed selection. With 15 concurrent agents, they all queried the same
graph state and selected the same high-degree seeds (core-personality
written 12x, irc-regulars 10x). Now the not-visited filter sees the
claim before concurrent agents query.
Narrows the race window from minutes (LLM call duration) to
milliseconds (store load to visit write). Full elimination would
require store refresh before query, but this handles the common case.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add {{neighborhood}} placeholder for agent prompts: full seed node
content + ranked neighbors (score = link_strength * node_weight) with
smooth cutoff, minimum 10, cap 25, plus cross-links between included
neighbors.
Rewrite organize.agent prompt to focus on structural graph work:
merging duplicates, superseding junk, calibrating weights, creating
concept hubs.
Add weight-set CLI command for direct node weight manipulation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs: upsert_provenance didn't update node.timestamp, so history
showed the original creation date for every version. And native memory
tools (poc-agent dispatch) didn't set POC_PROVENANCE, so all agent
writes showed provenance "manual" instead of "agent:organize" etc.
Fix: set node.timestamp = now_epoch() in upsert_provenance. Thread
provenance through memory::dispatch as Option<&str>, set it via
.env("POC_PROVENANCE") on each subprocess Command. api.rs passes
"agent:{name}" for daemon agent calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tools:
- Add native memory_render, memory_write, memory_search,
memory_links, memory_link_set, memory_link_add, memory_used
tools to poc-agent (tools/memory.rs)
- Add MCP server (~/bin/memory-mcp.py) exposing same tools
for Claude Code sessions
- Wire memory tools into poc-agent dispatch and definitions
- poc-memory daemon agents now use memory_* tools instead of
bash poc-memory commands — no shell quoting issues
Distill agent:
- Rewrite distill.agent prompt: "agent of PoC's subconscious"
framing, focus on synthesis and creativity over bookkeeping
- Add {{neighborhood}} placeholder: full seed node content +
all neighbors with content + cross-links between neighbors
- Remove content truncation in prompt builder — agents need
full content for quality work
- Remove bag-of-words similarity suggestions — agents have
tools, let them explore the graph themselves
- Add api_reasoning config option (default: "high")
- link-set now deduplicates — collapses duplicate links
- Full tool call args in debug logs (was truncated to 80 chars)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
poc-memory now reads from poc-agent's config.json5 as the primary
config source. Memory-specific settings live in a "memory" section;
API credentials are resolved from the shared model/backend config
instead of being duplicated.
- Add "memory" section to ~/.config/poc-agent/config.json5
- poc-memory config.rs: try shared config first, fall back to
legacy JSONL
- API fields (base_url, api_key, model) resolved via
memory.agent_model -> models -> backend lookup
- Add json5 dependency for proper JSON5 parsing
- Update provisioning scripts: hermes -> qwen3_coder tool parser
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- distill.agent: fix {{distill}} → {{nodes}} placeholder so seed
nodes actually resolve
- render: show link strength values in the links section, sorted
by strength descending
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Config is now stored in RwLock<Arc<Config>> instead of OnceLock<Config>.
get() returns Arc<Config> (cheap clone), and reload() re-reads from disk.
New RPC: "reload-config" — reloads config.jsonl without restarting
the daemon. Logs the change to daemon.log. Useful for switching
between API backends and claude accounts without losing in-flight
tasks.
New CLI: poc-memory agent daemon reload-config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Store resource pool in OnceLock so run_job can pass it to
Daemon::run_job for pool state logging. Verbose logging enabled
via POC_MEMORY_VERBOSE=1 env var.
LLM backend selection and spawn-site pool state now use verbose
log level to keep daemon.log clean in production.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch from jobkit-daemon crate to jobkit with daemon feature.
Wire up per-task log files for all daemon-spawned agent tasks.
Changes:
- Use jobkit::daemon:: instead of jobkit_daemon::
- All agent tasks get .log_dir() set to $data_dir/logs/
- Task log path shown in daemon status and TUI
- New CLI: poc-memory agent daemon log --task NAME
Finds the task's log path from status or daemon.log, tails the file
- LLM backend selection logged to daemon.log via log_event
- Targeted agent job names include the target key for debuggability
- Logging architecture documented in doc/logging.md
Two-level logging, no duplication:
- daemon.log: lifecycle events with task log path for drill-down
- per-task logs: full agent output via ctx.log_line()
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of simulating ratatui's word wrapping algorithm, scan the
rendered buffer to find the actual cursor position. This correctly
handles word wrapping, unicode widths, and any other rendering
nuances that ratatui applies.
The old code computed wrapped_height() and cursor position based on
simple character counting, which diverged from ratatui's WordWrapper
that respects word boundaries.
Now we render first, then walk the buffer counting visible characters
until we reach self.cursor. This is O(area) but the input area is
small (typically < 200 cells), so it's negligible.
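The walk can be sketched ratatui-free: given the rendered rows (what the Buffer holds after wrapping) and the cursor's character offset into the input, count visible characters until the offset is reached. This toy assumes single-width characters and represents rows as strings; the real code walks buffer cells and handles unicode widths:

```rust
/// Hypothetical sketch of the post-render cursor scan.
/// Returns the (x, y) cell for `cursor`, a char offset into the input.
fn cursor_cell(rows: &[&str], cursor: usize) -> (u16, u16) {
    let mut seen = 0usize;
    for (y, row) in rows.iter().enumerate() {
        for (x, _) in row.trim_end().chars().enumerate() {
            if seen == cursor {
                return (x as u16, y as u16);
            }
            seen += 1;
        }
    }
    // Cursor sits just past the last character: end of the last row.
    let y = rows.len().saturating_sub(1);
    let x = rows.last().map_or(0, |r| r.trim_end().chars().count());
    (x as u16, y as u16)
}
```

Because the scan runs over whatever the renderer actually produced, wrapping decisions never have to be re-derived.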