- Add compact_timestamp() to store, replacing 5 copies of
  format_datetime(now_epoch()).replace([':', '-', 'T'], "").
  Also fixes missing seconds (format_datetime only had HH:MM).
- Add ConsolidationPlan::to_agent_runs(), replacing the identical
  plan-to-runs-list expansion in consolidate.rs and daemon.rs.
- Port job_rename_agent to use run_one_agent, eliminating the manual
  prompt building, LLM call, report storage, and visit recording
  that duplicated the shared pipeline.
- Rename Confidence::weight()/value() to delta_weight()/gate_value()
  to clarify the distinction (delta metrics vs depth gating).
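The separator stripping behind compact_timestamp() can be sketched as follows. This is only an illustration: `compact` is a hypothetical name, and it assumes the formatted timestamp already includes seconds in an ISO-like layout.

```rust
/// Strip the separators from an ISO-style timestamp string.
/// Hypothetical helper; assumes input such as "2026-02-28T14:05:09".
fn compact(formatted: &str) -> String {
    // A char array is a valid pattern: any of the three characters matches.
    formatted.replace([':', '-', 'T'], "")
}
```

With seconds present the result has 14 digits; an HH:MM-only input yields 12, which is the truncation the fix above addresses.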
Three places duplicated the agent execution loop (build prompt → call
LLM → store output → parse actions → record visits): consolidate.rs,
knowledge.rs, and daemon.rs. Extract into run_one_agent() in
knowledge.rs that all three now call.
Also standardize consolidation agent prompts to use WRITE_NODE/LINK/REFINE,
the same commands the parser handles. Previously agents output
CATEGORIZE/NOTE/EXTRACT/DIGEST/DIFFERENTIATE/MERGE/COMPRESS, which were
silently dropped after the second-LLM-call removal.
The consolidation pipeline previously made a second Sonnet call to
extract structured JSON actions from agent reports. This was both
wasteful (extra LLM call per consolidation) and lossy (only extracted
links and manual items, ignoring WRITE_NODE/REFINE).
Now actions are parsed and applied inline after each agent runs, using
the same parse_all_actions() parser as the knowledge loop. The daemon
scheduler's separate apply phase is also removed.
Also deletes 8 superseded/orphaned prompt .md files (784 lines) that
have been replaced by .agent files.
These prompts are now embedded in their .agent files or no longer
called from any code path.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The four knowledge agents (observation, extractor, connector,
challenger) were hardcoded in knowledge.rs with their own node
selection logic that bypassed the query pipeline and visit tracking.
Now they're .agent files like the consolidation agents:
- extractor: not-visited:extractor,7d | sort:priority | limit:20
- observation: uses new {{CONVERSATIONS}} placeholder
- connector: type:semantic | not-visited:connector,7d
- challenger: type:semantic | not-visited:challenger,14d
The knowledge loop's run_cycle dispatches through defs::run_agent
instead of calling hardcoded functions, so all agents get visit
tracking automatically. This means the extractor now sees _facts-*
and _mined-transcripts nodes that it was previously blind to.
~200 lines of dead code removed (old runner functions, spectral
clustering for node selection, per-agent LLM dispatch).
New placeholders in defs.rs:
- {{CONVERSATIONS}} — raw transcript fragments for observation agent
- {{TARGETS}} — alias for {{NODES}} (challenger compatibility)
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Adds `poc-memory daemon run-agent <type> <count>` CLI command that
sends an RPC to the daemon to queue agent runs, instead of spawning
separate processes.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The parser was using split_once("\n\n") which broke when the prompt
started immediately after the JSON header (no blank line). Parse
the first line as JSON, treat the rest as the prompt body.
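A minimal sketch of the first-line split described above. `split_header` is a hypothetical name, and the JSON deserialization step is left out so the example stays std-only.

```rust
/// Split an agent file into its JSON header (first line) and prompt body.
/// Hypothetical helper; the real parser would go on to deserialize the header.
fn split_header(text: &str) -> (&str, &str) {
    match text.split_once('\n') {
        // Skip any blank lines after the header so both layouts parse alike.
        Some((header, rest)) => (
            header.trim_end(),
            rest.trim_start_matches(['\r', '\n']),
        ),
        // A file with no newline is treated as a header with an empty body.
        None => (text.trim_end(), ""),
    }
}
```

Splitting on the first newline rather than on "\n\n" means the blank line between header and body becomes optional instead of required.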
All agents now go through the config-driven path via .agent files.
agent_prompt() just delegates to defs::run_agent(). Remove the 100+
line hardcoded match block and the _pub wrapper functions — make the
formatters pub directly.
Replace the formatter dispatch with a generic {{placeholder}} lookup
system. Placeholders in prompt templates are resolved at runtime from
a table: topology, nodes, episodes, health, pairs, rename, split.
The query in the header selects what to operate on (keys for visit
tracking); placeholders pull in formatted context. Placeholders that
produce their own node selection (pairs, rename) contribute keys back.
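The placeholder lookup can be illustrated roughly like this. It is a sketch under assumptions: `resolve` and the precomputed string table are hypothetical, whereas the real table presumably maps names to formatter functions that also return node keys.

```rust
use std::collections::HashMap;

/// Replace every `{{name}}` in `template` using `table`.
/// Unknown placeholders are left in place so that typos stay visible.
fn resolve(template: &str, table: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = &after[..end];
                match table.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        out.push_str("{{");
                        out.push_str(name);
                        out.push_str("}}");
                    }
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated "{{": copy the remainder verbatim and stop.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Adding a new placeholder then means adding a table entry rather than another match arm.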
Port health, separator, rename, and split agents to .agent files.
All 7 agents now use the config-driven path.
Each agent is a .agent file: JSON config on the first line, blank line,
then the raw prompt markdown. Fully self-contained, fully readable.
No separate template files needed.
Agents dir: checked into the repo at poc-memory/agents/. Code looks there
first (via CARGO_MANIFEST_DIR), then falls back to ~/.claude/memory/agents/.
Three agents migrated: replay, linker, transfer.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
New append-only visits.capnp log records which agent processed which
node and when. Only recorded on successful completion — transient
errors don't mark nodes as "seen."
Schema: AgentVisit{nodeUuid, nodeKey, agent, timestamp, outcome}
Storage: append_visits(), replay_visits(), in-memory VisitIndex
Recording: daemon records visits after successful LLM call
API: agent_prompt() returns AgentBatch{prompt, node_keys} so callers
know which nodes to mark as visited.
Groundwork for using visit recency in agent node selection — agents
will deprioritize recently-visited nodes.
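One way an in-memory visit index could support recency checks is sketched below. The field layout and method names are assumptions for illustration; only the name VisitIndex comes from the text above.

```rust
use std::collections::HashMap;

/// In-memory index: (agent, node key) -> timestamp of the latest visit.
/// Hypothetical layout, not the actual VisitIndex definition.
#[derive(Default)]
struct VisitIndex {
    last: HashMap<(String, String), i64>,
}

impl VisitIndex {
    /// Record a successful visit, keeping only the most recent timestamp.
    fn record(&mut self, agent: &str, node_key: &str, timestamp: i64) {
        let slot = self
            .last
            .entry((agent.to_string(), node_key.to_string()))
            .or_insert(timestamp);
        *slot = (*slot).max(timestamp);
    }

    /// True if `agent` has not visited `node_key` within `window_secs` of `now`.
    fn not_visited(&self, agent: &str, node_key: &str, now: i64, window_secs: i64) -> bool {
        match self.last.get(&(agent.to_string(), node_key.to_string())) {
            Some(&ts) => now - ts > window_secs,
            None => true,
        }
    }
}
```

Keying on the agent as well as the node lets one agent's visit leave the node eligible for every other agent.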
New `poc-memory dedup` command (--apply for live run, dry-run by
default). Finds nodes sharing the same key but different UUIDs,
classifies them as identical or diverged, picks a survivor
(prefer most edges, then highest version), tombstones the rest,
and redirects all edges from doomed UUIDs to the survivor.
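The survivor rule (most edges, then highest version) might look like the following. `Candidate` and its integer `uuid` field are stand-ins invented for this sketch.

```rust
/// One of several nodes sharing the same key. Hypothetical type;
/// `uuid` is an integer stand-in for a real UUID.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    uuid: u32,
    edges: usize,
    version: u32,
}

/// Pick the survivor: prefer the most edges, then the highest version.
fn pick_survivor(dupes: &[Candidate]) -> Option<&Candidate> {
    // Tuples compare lexicographically, so `edges` dominates `version`.
    dupes.iter().max_by_key(|c| (c.edges, c.version))
}
```

Every candidate other than the returned one would then be tombstoned and have its edges redirected to the survivor.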
All write paths (upsert_node, upsert_provenance, delete_node,
rename_node, ingest_units) now hold StoreLock across the full
refresh→check→write cycle. This prevents the race where two
concurrent processes both see a key as "new" and create separate
UUIDs for it.
Adds append_nodes_unlocked() and append_relations_unlocked() for
callers already holding the lock. Adds refresh_nodes() to replay
log tail under lock before deciding create vs update.
Also adds find_duplicates() for detecting existing duplicates
in the log (replays full log, groups live nodes by key).
- Refactor split from serial batch to independent per-node tasks
  (run-agent split N spawns N parallel tasks, gated by llm_concurrency)
- Replace cosine similarity edge inheritance with agent-assigned
  neighbors in the plan JSON: the LLM already understands the
  semantic relationships, no need to approximate with bag-of-words
- Add --strict-mcp-config to claude CLI calls to skip MCP server
  startup (saves ~5s per call)
- Remove hardcoded 2000-char split threshold; let the agent decide
  what's worth splitting
- Reload store before mutations to handle concurrent split races
Episodic nodes (journal entries, digests) are narratives that should
not be split even when large. Only semantic reference nodes that have
grown to cover multiple topics are candidates.
Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.
This handles arbitrarily large nodes because output is always
proportional to one child, not the whole parent. Tested on the kent
node (19K chars → 3 children totaling 20K chars with clean topic
separation).
New files:
  prompts/split-plan.md: phase 1 planning prompt
  prompts/split-extract.md: phase 2 extraction prompt
  prompts/split.md: original single-phase (kept for reference)
Modified:
  agents/prompts.rs: split_candidates(), split_plan_prompt(),
    split_extract_prompt(), agent_prompt "split" arm
  agents/daemon.rs: job_split_agent() two-phase implementation,
    RPC dispatch for "split" agent type
  tui.rs: added "split" to AGENT_TYPES
New consolidation agent that reads node content and generates semantic
3-5 word kebab-case keys, replacing auto-generated slugs (5K+ journal
entries with truncated first-line slugs, 2.5K mined transcripts with
opaque UUIDs).
Implementation:
- prompts/rename.md: agent prompt template with naming conventions
- prompts.rs: format_rename_candidates() selects nodes with long
  auto-generated keys, newest first
- daemon.rs: job_rename_agent() parses RENAME actions from LLM
  output and applies them directly via store.rename_node()
- Wired into RPC handler (run-agent rename) and TUI agent types
- Fix epoch_to_local panic on invalid timestamps (fallback to UTC)
Rename dramatically improves search: key-component matching on
"journal#2026-02-28-violin-dream-room" makes the node findable by
"violin", "dream", or "room" — the auto-slug was unsearchable.
Per-agent-type tabs (health, replay, linker, separator, transfer,
apply, orphans, cap, digest, digest-links, knowledge) with dynamic
visibility — tabs only appear when tasks or log history exist.
Features:
- Overview tab: health gauges (α, gini, cc, episodic%), in-flight
  tasks, and recent log entries
- Pipeline tab: table with phase ordering and status
- Per-agent tabs: active tasks, output logs, log history
- Log tab: auto-scrolling daemon.log tail
- Vim-style count prefix: e.g. 5r runs 5 iterations of the agent
- Flash messages for RPC feedback
- Tab/Shift-Tab navigation, number keys for tab selection
Also adds run-agent RPC to the daemon: accepts agent type and
iteration count, spawns chained tasks with LLM resource pool.
poc-memory status launches the TUI when stdout is a terminal and the
daemon is running, and falls back to text output otherwise.
match_seeds() previously only found nodes whose keys exactly matched
search terms. This meant searches like "formal verification" or
"bcachefs plan" returned nothing — no nodes are keyed with those
exact strings.
Three-tier matching strategy:
1. Exact key match (full weight): unchanged
2. Key component match (0.5× weight): split keys on -/_/./#,
   match individual words. "plan" now finds "the-plan", "verification"
   finds "c-to-rust-verification-workflow", etc.
3. Content match (0.2× weight, capped at 50 hits): search node
   content for terms that didn't match any key. Catches nodes whose
   keys are opaque but whose content is relevant.
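The first two tiers can be sketched as below. `key_score` is a hypothetical name, matching is assumed to be case-sensitive, and the content tier is omitted.

```rust
/// Score one key against one search term using the first two tiers.
/// Returns None when neither tier matches (content search would follow).
fn key_score(key: &str, term: &str) -> Option<f64> {
    if key == term {
        return Some(1.0); // tier 1: exact key match, full weight
    }
    // Tier 2: split the key on -, _, . and # and compare whole components.
    let hit = key
        .split(['-', '_', '.', '#'])
        .any(|part| !part.is_empty() && part == term);
    if hit {
        Some(0.5)
    } else {
        None
    }
}
```

Comparing whole components rather than substrings keeps "plan" from matching a key such as "explanation".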
Also adds prompt-based seeding to the hook pipeline: extract_query_terms
from the user's prompt and merge into the term set. Previously the hook
only seeded from transcript scanning (finding node keys as substrings
in conversation history), which meant fresh sessions or queries about
new topics produced no search results at all.
Wire up the PostToolUse handler to call memory-search --hook, passing
through the hook JSON on stdin. This drains pending context chunks
saved by the initial UserPromptSubmit load, delivering them one per
tool call until all chunks are delivered.
Claude Code's hook output limit (~10K chars) was truncating the full
context load. Split output into chunks at section boundaries, deliver
first chunk on UserPromptSubmit, save remaining chunks to disk for
drip-feeding on subsequent PostToolUse calls.
Two-pass algorithm: split at "--- KEY (group) ---" boundaries, then
merge adjacent small sections up to 9K per chunk. Separates session_id
guard (needed for chunk state) from prompt guard (needed only for
search), so PostToolUse events without a prompt can still pop chunks.
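The two passes could be sketched like this. Function names are hypothetical, sizes are counted in bytes rather than characters, and a header is assumed to be any line that starts with "--- " and ends with " ---".

```rust
/// Pass 1: split `text` so that each section starts at a header line.
fn split_sections(text: &str) -> Vec<String> {
    let mut sections: Vec<String> = Vec::new();
    for line in text.split_inclusive('\n') {
        let is_header = line.starts_with("--- ") && line.trim_end().ends_with(" ---");
        // Text before the first header still gets a section of its own.
        if is_header || sections.is_empty() {
            sections.push(String::new());
        }
        if let Some(current) = sections.last_mut() {
            current.push_str(line);
        }
    }
    sections
}

/// Pass 2: greedily merge adjacent sections while a chunk stays <= `max` bytes.
fn merge_sections(sections: Vec<String>, max: usize) -> Vec<String> {
    let mut chunks: Vec<String> = Vec::new();
    for section in sections {
        let fits = chunks
            .last()
            .map_or(false, |last| last.len() + section.len() <= max);
        if fits {
            if let Some(last) = chunks.last_mut() {
                last.push_str(&section);
            }
        } else {
            // A single oversized section becomes its own chunk unchanged.
            chunks.push(section);
        }
    }
    chunks
}
```

Because sections are only ever concatenated in order, joining the chunks reproduces the original output exactly.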
mark_returned() was append-only without checking if the key already
existed, causing duplicates to accumulate across hook invocations.
load_returned() then returned all entries including duplicates, which
made the returned count exceed the seen count, causing a u64 underflow
in the pre-seeded calculation.
Fix: check load_returned() before appending in mark_returned(), dedup
on read in load_returned(), and use saturating_sub for the pre-seeded
count as a safety net.
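A reduced sketch of the two guards follows. The function names mirror the ones above, but the bodies are illustrative and operate on an in-memory Vec rather than the on-disk state.

```rust
/// Append `key` only if it is not already present (dedup on write).
fn mark_returned(returned: &mut Vec<String>, key: &str) {
    if !returned.iter().any(|k| k == key) {
        returned.push(key.to_string());
    }
}

/// Pre-seeded count that clamps at zero instead of underflowing.
fn pre_seeded(seen: u64, returned: u64) -> u64 {
    seen.saturating_sub(returned)
}
```

A plain `seen - returned` on u64 panics in debug builds and wraps in release builds when `returned` is larger, which is why the saturating form is the safer default.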
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Spawn memory-search --hook as a subprocess, piping the hook input
JSON through stdin and printing its stdout. This ensures memory
context injection goes through the same hook whose output Claude
Code reliably persists, fixing the issue where memory-search as a
separate hook had its output silently dropped.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Move JsonlBackwardIter and find_last_compaction() from
parse-claude-conversation into a shared transcript module. Both
memory-search and parse-claude-conversation now use the same robust
compaction detection: mmap-based backward scan, JSON parsing to
verify user-type message, content prefix check.
Replaces memory-search's old detect_compaction() which did a forward
scan with raw string matching on "continued from a previous
conversation" — that could false-positive on the string appearing
in assistant output or tool results.
Add parse-claude-conversation as a new binary for debugging what's
in the context window post-compaction.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
chrono's timestamp_opt can return None during DST transitions.
Handle all three variants (Single, Ambiguous, None) instead of
unwrapping. For DST gaps, offset by one hour to land in valid
local time.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace monolithic consolidate job with individual agent jobs
(replay, linker, separator, transfer, health) that run sequentially
and store reports. Multi-phase daily pipeline: agent runs → apply
actions → link orphans → cap degree → digest → digest links →
knowledge loop.
Add GraphHealth struct with graph metrics (alpha, gini, clustering
coefficient, episodic ratio) computed during health checks. Display
in `poc-memory daemon status`. Use cached metrics to build
consolidation plan without expensive O(n²) interference detection.
Add RPC consolidate command to trigger consolidation via socket.
Harden session watcher: skip transcripts with zero segments, improve
migration error handling.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Node weight no longer gates signal propagation — only edge_decay
and edge_strength affect traversal. Node weight is applied at the
end for ranking. This lets low-weight nodes serve as bridges
without killing the signal passing through them.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
All seeds emit at once. At each hop, activations from all sources
sum at each node, and the combined map propagates on the next hop.
Nodes where multiple wavefronts overlap get reinforced and radiate
stronger — natural interference patterns.
Lower default min_activation threshold (×0.1) since individual
contributions are smaller in additive mode.
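One hop of additive propagation might be sketched as follows. The edge representation (directed triples of source, target, strength) and the function name are assumptions; node weight is deliberately absent because it is applied only at ranking time.

```rust
use std::collections::HashMap;

/// One hop of additive propagation: each active node sends
/// activation * edge_strength * edge_decay along its outgoing edges,
/// and contributions arriving at the same node are summed.
fn propagate(
    edges: &[(&str, &str, f64)],
    active: &HashMap<String, f64>,
    edge_decay: f64,
) -> HashMap<String, f64> {
    let mut next: HashMap<String, f64> = HashMap::new();
    for &(from, to, strength) in edges {
        if let Some(&a) = active.get(from) {
            *next.entry(to.to_string()).or_insert(0.0) += a * strength * edge_decay;
        }
    }
    next
}
```

A node reached from two seeds ends up with the sum of both contributions, which is the reinforcement effect described above.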
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Initialize direction from the two most spectrally separated seeds
instead of relying on input order (which was alphabetical from
BTreeMap). Run 3 rounds of power iteration with normalization
instead of 1 for better convergence.
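For reference, power iteration with per-round normalization can be sketched generically as below. The matrix being iterated and the way the initial direction is derived from seeds are not specified here, so both are plain parameters in this illustration.

```rust
/// Scale a vector to unit length (no-op for the zero vector).
fn normalize(v: &mut [f64]) {
    let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Estimate the dominant direction of a square matrix `m` by repeated
/// multiply-and-normalize, starting from `init`.
fn power_iterate(m: &[Vec<f64>], init: &[f64], rounds: usize) -> Vec<f64> {
    let mut v = init.to_vec();
    normalize(&mut v);
    for _ in 0..rounds {
        let mut next: Vec<f64> = m
            .iter()
            .map(|row| row.iter().zip(&v).map(|(a, b)| a * b).sum::<f64>())
            .collect();
        normalize(&mut next);
        v = next;
    }
    v
}
```

Each extra round pulls the estimate closer to the dominant eigenvector, so three rounds generally land nearer than one when starting from the same direction.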
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Three new composable search stages:
confluence — multi-source spreading activation. Unlike spread (which
takes max from any source), confluence rewards nodes reachable from
multiple seeds additively. Naturally separates unrelated seed groups
since their neighborhoods don't overlap. Params: max_hops, edge_decay,
min_sources.
geodesic — straightest path between seed pairs in spectral space.
At each graph hop, picks the neighbor whose spectral direction most
aligns with the target (cosine similarity of direction vectors).
Nodes on many geodesic paths score highest. Params: max_path, k.
manifold — extrapolation along the direction seeds define. Computes
weighted centroid + principal axis of seeds in spectral space, then
scores candidates by projection onto that axis (penalized by
perpendicular distance). Finds what's "further along" rather than
"nearby." Params: k.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace O(n²) Vec::contains + sort/dedup with O(n) HashSet for orphan
node tracking in health_report(). Use imported HashMap type instead of
fully-qualified std::collections::HashMap.
- New agents/transcript.rs: shared JSONL parsing for enrich, fact_mine,
  and knowledge (was 3 separate implementations, ~150 lines duplicated)
- New best_match() and section_children() helpers in neuro/rewrite.rs
  (was duplicated find-best-by-similarity loop + section collection)
- Net -153 lines
- Replace `pub use types::*` in store/mod.rs with explicit re-export list
- Make transcript_dedup_key private in agents/enrich.rs (only used internally)
- Inline duplicated projects_dir() helper in agents/knowledge.rs and daemon.rs
Replace all partial_cmp().unwrap() with total_cmp() in spectral.rs
and knowledge.rs — eliminates potential panics on NaN without
changing behavior for normal floats.
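The difference is easy to see in a small sort helper. `sort_scores` is a hypothetical name used only for this sketch.

```rust
/// Sort scores ascending without panicking on NaN.
/// `a.partial_cmp(b).unwrap()` would panic here if any score were NaN.
fn sort_scores(scores: &mut [f64]) {
    // total_cmp defines a total order over all f64 values, NaN included.
    scores.sort_by(|a, b| a.total_cmp(b));
}
```

Ordinary finite values keep their usual numeric order; only the placement of NaN values is new.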
Use existing weighted_distance() and eigenvalue_weights() helpers in
nearest_neighbors() and nearest_to_seeds() instead of inlining the
same distance computation.
Move parse_timestamp_to_epoch() from enrich.rs to util.rs — was
duplicated logic, now shared.
Replace O(n²) relation existence check in init_from_markdown() with
a HashSet of (source, target) UUID pairs. With 26K relations this
was scanning linearly for every link in every markdown unit.
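The set-based existence check could look roughly like this. UUIDs are represented as u64 for brevity, and `new_links` is a hypothetical helper rather than the code in init_from_markdown().

```rust
use std::collections::HashSet;

/// Return only the links whose (source, target) pair is not already known.
/// Each accepted pair is recorded, so repeats within `links` are skipped too.
fn new_links(existing: &[(u64, u64)], links: &[(u64, u64)]) -> Vec<(u64, u64)> {
    let mut seen: HashSet<(u64, u64)> = existing.iter().copied().collect();
    links
        .iter()
        .copied()
        // HashSet::insert returns false when the pair was already present.
        .filter(|pair| seen.insert(*pair))
        .collect()
}
```

Building the set costs one pass over the existing relations, after which each lookup is constant time on average instead of a linear scan.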
Move prompts_dir into Config (was hardcoded ~/poc/memory/prompts).
Replace hardcoded ~/.claude/memory paths in spectral.rs, graph.rs,
and main.rs with store::memory_dir() or config::get(). Replace
hardcoded ~/.claude/projects in knowledge.rs and main.rs with
config::get().projects_dir.
Extract apply_agent_file() from cmd_apply_agent() — separates
file scanning from per-file JSON parsing and link application.
Add util::truncate() and util::first_n_chars() to replace 16 call
sites doing the same floor_char_boundary or chars().take().collect()
patterns. Deduplicate the batching loop in consolidate.rs (4 copies
→ 1 loop over an array). Fix all clippy warnings: redundant closures,
needless borrows, collapsible if, unnecessary cast, manual strip_prefix.
Net: -44 lines across 16 files.
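The two helpers might be shaped as follows. This is a sketch: the real signatures in util are not shown here, and is_char_boundary is used in place of floor_char_boundary so the example builds on older stable toolchains.

```rust
/// Return the longest prefix of `s` that is at most `max_bytes` long
/// and ends on a UTF-8 character boundary.
fn truncate(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Return the first `n` characters (not bytes) of `s`.
fn first_n_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}
```

Slicing a &str at an arbitrary byte offset panics when the offset falls inside a multi-byte character, which is the failure both helpers avoid.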
Replace hand-rolled argument parsing (match on args[1], manual
iteration over &[String]) with Clap's derive macros. All 60+
subcommands now have typed arguments with defaults, proper help
text, and error messages generated automatically.
The 83-line usage() function is eliminated — Clap generates help
from the struct annotations. Nested subcommands (digest daily/
weekly/monthly/auto, journal-tail --level) use Clap's subcommand
nesting naturally.
poc-daemon (notification routing, idle timer, IRC, Telegram) was already
fully self-contained with no imports from the poc-memory library. Now it's
a proper separate crate with its own Cargo.toml and capnp schema.
poc-memory retains the store, graph, search, neuro, knowledge, and the
jobkit-based memory maintenance daemon (daemon.rs).
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Category was a manually-assigned label with no remaining functional
purpose (decay was the only behavior it drove, and that's gone).
Remove the enum, its methods, category_counts, the --category search
filter, and all category display. The field remains in the capnp
schema for backwards compatibility but is no longer read or written.
Status and health reports now show NodeType breakdown (semantic,
episodic, daily, weekly, monthly) instead of categories.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace hardcoded "identity" lookups with config.core_nodes so
experience mining and init work with whatever core nodes are
configured, not just a node named "identity".
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Graph-wide decay is the wrong approach — node importance should emerge
from graph topology (degree, centrality, usage patterns), not a global
weight field multiplied by a category-specific factor.
Remove: Store::decay(), Store::categorize(), Store::fix_categories(),
Category::decay_factor(), cmd_decay, cmd_categorize, cmd_fix_categories,
job_decay, and all category assignments at node creation time.
Category remains in the schema as a vestigial field (removing it
requires a capnp migration) but no longer affects behavior.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace key prefix matching (journal#j-, daily-, weekly-, monthly-)
with NodeType filters (EpisodicSession, EpisodicDaily, EpisodicWeekly,
EpisodicMonthly) for all queries: journal-tail, digest gathering,
digest auto-detection, experience mining dedup, and find_journal_node.
Add EpisodicMonthly to NodeType enum and capnp schema.
Key naming conventions (journal#j-TIMESTAMP-slug, daily-DATE, etc.)
are retained for key generation — the fix is about how we find nodes,
not how we name them.
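The shift from prefix matching to type filtering can be illustrated as below. The `Node` struct and the `Semantic` variant are assumptions added for the sketch; the episodic variant names come from the text above.

```rust
/// Node kinds; `Semantic` is assumed here for contrast.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeType {
    Semantic,
    EpisodicSession,
    EpisodicDaily,
    EpisodicWeekly,
    EpisodicMonthly,
}

/// Hypothetical minimal node: just a key and its type.
struct Node {
    key: String,
    node_type: NodeType,
}

/// Select keys by node type instead of by key prefix.
fn keys_of_type(nodes: &[Node], wanted: NodeType) -> Vec<&str> {
    nodes
        .iter()
        .filter(|n| n.node_type == wanted)
        .map(|n| n.key.as_str())
        .collect()
}
```

A journal entry whose key was later renamed and no longer starts with "journal#j-" is still found, because the filter never inspects the key.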
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
All nodes in the store are memory — none should be excluded from
knowledge extraction, search, or graph algorithms by name. Removed
the MEMORY/where-am-i/work-queue/work-state skip lists entirely.
Deleted where-am-i and work-queue nodes from the store (ephemeral
scratchpads that don't belong). Added orphan edge pruning to fsck
so broken links get cleaned up automatically.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>