Agent subprocess calls now set POC_PROVENANCE=agent:{name} so any
nodes/links created via tool calls are tagged with the creating agent.
This makes agent transcripts indistinguishable from conscious sessions
in format — important for future model training.
new_relation() now reads POC_PROVENANCE env var directly (raw string,
not enum) since agent names are dynamic.
link-add now computes initial strength from Jaccard similarity instead
of hardcoded 0.8. New links start at a strength reflecting actual
neighborhood overlap.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Some Sonnet runs preemptively refuse to use tools ("poc-memory tool
needs approval") without attempting to run them. Add an explicit
instruction that tools are pre-approved and should be used directly.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add adjust_edge_strength() to Store — modifies strength on all edges
between two nodes, clamped to [0.05, 0.95].
New commands:
- `not-relevant KEY` — weakens ALL edges to the node by 0.01
(bad routing: search found the wrong thing)
- `not-useful KEY` — weakens node weight, not edges
(bad content: search found the right thing but it's not good)
Enhanced `used KEY` — now also strengthens all edges to the node by
0.01, in addition to the existing node weight boost.
Three-tier design: agents adjust by 0.00001 (automatic), conscious
commands adjust by 0.01 (deliberate), manual override sets directly.
All clamped, never hitting 0 or 1.
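
A minimal sketch of the clamped adjustment described above. The `Edge`
struct, its field names, and the either-direction matching are
assumptions for illustration; only the [0.05, 0.95] clamp and the step
sizes come from this message.

```rust
/// Hypothetical edge record; the real Store layout is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    source: String,
    target: String,
    strength: f32,
}

const MIN_STRENGTH: f32 = 0.05;
const MAX_STRENGTH: f32 = 0.95;

/// Add `delta` to every edge between `a` and `b` (either direction),
/// clamping the result so it never reaches 0 or 1.
/// Returns the number of edges that matched.
fn adjust_edge_strength(edges: &mut [Edge], a: &str, b: &str, delta: f32) -> usize {
    let mut matched = 0;
    for e in edges.iter_mut() {
        let connects =
            (e.source == a && e.target == b) || (e.source == b && e.target == a);
        if connects {
            e.strength = (e.strength + delta).clamp(MIN_STRENGTH, MAX_STRENGTH);
            matched += 1;
        }
    }
    matched
}
```

Under this sketch, `used KEY` would call the function with +0.01 for
each neighbor of KEY, and `not-relevant KEY` with -0.01.
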
Design spec: .claude/analysis/2026-03-14-link-strength-feedback.md
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add jaccard() and jaccard_strengths() to Graph. Jaccard similarity
measures neighborhood overlap between linked nodes — nodes sharing
many neighbors get stronger links, nodes with no shared neighbors
get weak links.
New subcommand: `poc-memory graph normalize-strengths [--apply]`
Scales raw Jaccard (typically 0.0-0.3) to useful range via j*3
clamped to [0.1, 1.0]. Skips implicit temporal edges (strength=1.0).
Applied to 64,969 edges. Distribution is bimodal: large cluster at
0.1-0.2 (weak) and spike at 0.9-1.0 (strong), with smooth gradient
between. Replaces the meaningless 0.3/0.8 split from manual/agent
creation methods.
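
For reference, a sketch of the Jaccard computation and the j*3 scaling
described above. The adjacency-map representation and the function
signatures are assumptions; the real jaccard() lives on Graph and may
treat the two endpoints differently.

```rust
use std::collections::{HashMap, HashSet};

/// Assumed representation: node key -> set of neighbor keys.
type Adjacency = HashMap<String, HashSet<String>>;

/// Jaccard similarity of the neighbor sets of `a` and `b`:
/// |N(a) ∩ N(b)| / |N(a) ∪ N(b)|. Returns 0.0 when both sets are empty.
/// Note: if a and b are linked, each appears in the other's set and
/// therefore counts toward the union but not the intersection.
fn jaccard(adj: &Adjacency, a: &str, b: &str) -> f32 {
    let empty = HashSet::new();
    let na = adj.get(a).unwrap_or(&empty);
    let nb = adj.get(b).unwrap_or(&empty);
    let shared = na.intersection(nb).count();
    let total = na.union(nb).count();
    if total == 0 {
        0.0
    } else {
        shared as f32 / total as f32
    }
}

/// Scale raw Jaccard into the useful strength range: j*3, clamped
/// to [0.1, 1.0].
fn scale_strength(j: f32) -> f32 {
    (j * 3.0).clamp(0.1, 1.0)
}
```

With this scaling, any raw Jaccard of 1/3 or higher saturates at 1.0 and
anything below about 0.033 floors at 0.1.
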
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Create jobkit-daemon crate with generic daemon infrastructure:
- event_log: JSONL append with size-based rotation
- socket: Unix domain socket RPC client and server with signal handling
- status: JSON status file read/write
Migrate daemon.rs to use the library:
- Worker pool setup via Daemon::new()
- Socket loop + signal handling via Daemon::run()
- RPC handlers as registered closures
- Logging, status writing, send_rpc all delegate to library
Migrate tui.rs to use socket::send_rpc() instead of inline UnixStream.
daemon.rs: 1952 → 1806 lines (-146), old status_socket_loop removed.
tui.rs: socket boilerplate removed.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Linker: give it Bash(poc-memory:*) tools so it can render nodes,
query neighbors, and search before creating. Adds search-before-create
discipline to reduce redundant node creation.
Organize: remove MERGE operation, make DELETE conservative (only true
duplicates or garbage). Add "Preserve diversity" rule — multiple nodes
on similar topics are features, not bugs. LINK is primary operation.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Rebalance consolidation scoring to be linker-heavy:
- 50 replay + 100 linker for extreme hub dominance (was 10+5)
- High gini now adds linker instead of replay
- Agent runs interleave types round-robin (linker, replay, linker...)
instead of running all of one type then all of another
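
The round-robin interleaving can be sketched as follows. The function
name and the (type, count) plan shape are illustrative assumptions, not
the actual ConsolidationPlan API.

```rust
/// Interleave agent runs round-robin across types.
/// `plan` lists (agent type, run count) in priority order.
fn interleave_runs(plan: &[(&str, usize)]) -> Vec<String> {
    let mut remaining: Vec<(&str, usize)> = plan.to_vec();
    let mut runs = Vec::new();
    // Keep making passes over the types until every count is exhausted.
    while remaining.iter().any(|(_, n)| *n > 0) {
        for (name, n) in remaining.iter_mut() {
            if *n > 0 {
                runs.push((*name).to_string());
                *n -= 1;
            }
        }
    }
    runs
}
```

Once one type runs out, the remaining runs of the other type follow
back to back, so a 3 linker + 1 replay plan yields linker, replay,
linker, linker.
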
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Compute parent/child (session→daily→weekly→monthly) and prev/next
(chronological ordering within each level) edges at graph build time
from node metadata. Parse dates from keys for digest nodes (whose
timestamps reflect creation, not covered date) and prefer key-parsed
dates over timestamp-derived dates for sessions (timezone fix).
Result: ~9185 implicit edges, communities halved, gini improved.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Previously the organize agent received a pre-computed cluster from a
term search — 69% of runs produced 0 actions because the same clusters
kept being found via different entry points.
Now: seed nodes shown with content previews and neighbor lists. Agent
uses tools (render, query neighbors, search) to explore outward and
discover what needs organizing. Visit filter set to 24h cooldown.
Prompt rewritten to encourage active exploration rather than static
cluster analysis.
Persistent cursor into the knowledge graph with navigation:
- temporal: forward/back among same-type nodes by timestamp
- hierarchical: up/down the digest tree (journal→daily→weekly→monthly)
- spatial: graph neighbor display at every position
The cursor file (~/.claude/memory/cursor) holds a single node key.
Show displays: temporal arrows, hierarchy links, semantic neighbors,
and full content. Date extraction from both timestamps and key names
handles the mixed-timestamp data gracefully.
This is the start of place cells — spatial awareness of position
in your own knowledge.
When generating a digest, automatically link all source entries to the
digest node (journal entries → daily, dailies → weekly, weeklies →
monthly). This builds the temporal spine of the graph — previously
~4000 journal entries were disconnected islands unreachable by recall.
Rewrote digest prompt to produce narrative rather than reports:
capture the feel, the emotional arc, what it was like to live through
it. Letter to future self, not a task log.
Moved prompt to digest.agent file alongside other agent definitions.
Falls back to prompts/digest.md if agent file not found.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Keys containing # are now pre-quoted in all cluster output (similarity
scores, hub analysis, node headers) so the agent copies them correctly
into bash commands. Prompt strengthened with CRITICAL warning about #
being a shell comment character.
Journal entries are now included in clusters but identified by node_type
(EpisodicSession) rather than key prefix, and tagged [JOURNAL — no
delete] in the output. Prompt rule 3b tells agent to LINK/REFINE
journals but never DELETE them. Digest nodes (daily/weekly/monthly)
still excluded entirely from clusters.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add progress callback to run_one_agent and run_and_apply so callers
can see: prompt size, node list, LLM call timing, parsed action
count, and per-action applied/skipped status. Daemon writes these
to the persistent event log via log_event.
Cap organize cluster at 20 nodes: a 126-node cluster produced a 682KB
prompt that timed out every time. The agent has tools to explore
further if needed. Restore general query for production runs.
Previous prompt was too documentation-heavy — agent pattern-matched
on example placeholders instead of doing actual work. New prompt:
structured as direct instructions, uses {{organize}} placeholder
for pre-computed cluster data, three clear decision paths (merge,
differentiate, keep both), numbered rules.
Convert daemon from hand-rolled string dispatch to proper clap
Subcommand enum with typed args. Add custom top-level help that
expands nested subcommands (same pattern as bcachefs-tools), so
`poc-memory --help` shows full paths like `agent daemon run`.
Add call_for_def() that threads model and tools from agent definitions
through to claude CLI. Tool-enabled agents get --allowedTools instead
of --tools "" and a longer 15-minute timeout for multi-turn work.
Add ActionKind::Delete with parse/apply support so agents can delete
nodes (used by organize agent for deduplication).
Use call_for_def() in run_one_agent instead of hardcoded call_sonnet.
Add `poc-memory graph organize TERM` diagnostic that finds nodes
matching a search term, computes pairwise cosine similarity, reports
connectivity gaps, and optionally creates anchor nodes.
Add organize.agent definition that uses Bash(poc-memory:*) tool access
to explore clusters autonomously — query selects highest-degree
unvisited nodes, agent drives its own iteration via poc-memory CLI.
Add {{organize}} placeholder in defs.rs for inline cluster resolution.
Add `tools` field to AgentDef/AgentHeader so agents can declare
allowed tool patterns (passed as --allowedTools to claude CLI).
Two changes:
1. New -q/--query flag for direct search without hook machinery.
Useful for debugging: memory-search -q inner-life-sexuality-intimacy
shows seeds, spread results, and rankings.
2. Prompt key boost: when the current prompt contains a node key
(>=5 chars) as a substring, boost that term by +10.0. This ensures
explicit mentions fire as strong seeds for spread, while the graph
still determines what gets pulled in.
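
A sketch of the prompt key boost from change 2. The score-map shape and
the case-sensitive substring match are assumptions; only the 5-char
minimum and the +10.0 boost come from this message.

```rust
use std::collections::HashMap;

const MIN_KEY_LEN: usize = 5;
const PROMPT_KEY_BOOST: f64 = 10.0;

/// Boost the seed score of every node key that appears verbatim in the
/// prompt. Keys shorter than MIN_KEY_LEN chars are ignored so that
/// short keys do not match incidental substrings.
fn boost_prompt_keys(prompt: &str, keys: &[&str], scores: &mut HashMap<String, f64>) {
    for key in keys {
        if key.chars().count() >= MIN_KEY_LEN && prompt.contains(*key) {
            *scores.entry(key.to_string()).or_insert(0.0) += PROMPT_KEY_BOOST;
        }
    }
}
```

The boost only affects seeding; which neighbors get pulled in is still
decided by the spread step.
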
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
New placeholder that expands query keys one hop through the graph,
giving agents visibility into what's already connected to the nodes
they're working on. Excludes the query keys themselves so there's
no duplication with {{nodes}}.
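
The one-hop expansion can be sketched like this. The adjacency map and
the function name are hypothetical; the real placeholder works on the
store's graph and formats its output for the prompt.

```rust
use std::collections::{BTreeSet, HashMap};

/// Collect every direct neighbor of the query keys, excluding the
/// query keys themselves. BTreeSet gives a stable, deduplicated order.
fn one_hop_neighbors(
    adj: &HashMap<String, Vec<String>>,
    query_keys: &[&str],
) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    for key in query_keys {
        if let Some(neighbors) = adj.get(*key) {
            for n in neighbors {
                if !query_keys.contains(&n.as_str()) {
                    out.insert(n.clone());
                }
            }
        }
    }
    out
}
```
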
Added to transfer (sees existing semantic nodes linked to episodes,
so it REFINEs instead of duplicating) and challenger (sees neighbor
context to find real evidence for/against claims).
Also removes find_existing_observations — superseded by the
per-segment dedup fix and this general-purpose placeholder.
When building the {{conversations}} placeholder for the observation
agent, search for existing nodes relevant to each conversation
fragment and include them in the prompt. Uses seed matching + one-hop
graph expansion to find the neighborhood, so the extractor sees what
the graph already knows about these topics.
This helps prevent duplicate extractions, but the deeper bug is that
select_conversation_fragments doesn't track which conversations have
already been processed — that's next.
The observation agent was re-extracting the same conversations every
consolidation run because select_conversation_fragments had no tracking
of what had already been processed.
Extract shared helpers from the fact miner's dedup pattern:
- transcript_key(prefix, path): namespaced key from prefix + filename
- segment_key(base, idx): per-segment key
- keys_with_prefix(prefix): bulk lookup from store
- unmined_segments(path, prefix, known): find unprocessed segments
- mark_segment(...): mark a segment as processed
Rewrite select_conversation_fragments to use these with
_observed-transcripts prefix. Each compaction segment within a
transcript is now tracked independently — new segments from ongoing
sessions get picked up, already-processed segments are skipped.
When connectivity shows isolated nodes, print copy-pasteable
poc-memory graph link-add commands targeting the highest-degree
node in the largest cluster. Closes the diagnose→fix loop.
BFS-based connectivity analysis as a query pipeline stage. Shows
connected components, islands, and sample paths between result nodes
through the full graph (max 4 hops).
poc-memory query "content ~ 'made love' | connectivity"
poc-memory query "(content ~ 'A' OR content ~ 'B') | connectivity"
Also documented in query --help.
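
A rough sketch of the component grouping behind the connectivity stage.
It assumes a symmetric adjacency map and omits the 4-hop cap and the
sample-path output; the names are illustrative, not the actual stage
implementation.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Assumed representation: node key -> neighbor keys (symmetric).
type Graph = BTreeMap<String, Vec<String>>;

/// Group the result nodes into connected components by running BFS
/// through the full graph. A component with a single member is an
/// island.
fn result_components(graph: &Graph, results: &[&str]) -> Vec<BTreeSet<String>> {
    let wanted: BTreeSet<&str> = results.iter().copied().collect();
    let mut seen: BTreeSet<String> = BTreeSet::new();
    let mut comps = Vec::new();
    for &start in results {
        if seen.contains(start) {
            continue; // already reached from an earlier result node
        }
        let mut comp = BTreeSet::new();
        let mut queue = VecDeque::from([start.to_string()]);
        seen.insert(start.to_string());
        while let Some(node) = queue.pop_front() {
            if wanted.contains(node.as_str()) {
                comp.insert(node.clone());
            }
            for next in graph.get(&node).into_iter().flatten() {
                if seen.insert(next.clone()) {
                    queue.push_back(next.clone());
                }
            }
        }
        comps.push(comp);
    }
    comps
}
```
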
render now appends a links section showing up to 15 neighbors as
copy-pasteable `poc-memory render` commands, making the graph
naturally walkable without memorizing node keys.
query --help now documents the full language: expressions, fields,
operators, pipe stages, functions, and examples. Inline help in
cmd_query replaced with pointer to --help.
Load store from both cache (rkyv/bincode) and raw capnp logs,
then diff: missing nodes, phantom nodes, version mismatches.
Auto-rebuilds cache if inconsistencies found.
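
The diff itself can be sketched as below, assuming each side has been
reduced to a key -> version map. The struct and function names are
hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Default, PartialEq)]
struct StoreDiff {
    missing: Vec<String>,    // in the capnp log but not in the cache
    phantom: Vec<String>,    // in the cache but not in the capnp log
    mismatched: Vec<String>, // present in both, versions differ
}

impl StoreDiff {
    /// True when the cache agrees with the log and no rebuild is needed.
    fn is_consistent(&self) -> bool {
        self.missing.is_empty() && self.phantom.is_empty() && self.mismatched.is_empty()
    }
}

/// Compare the cache against the log, treating the log as the source
/// of truth.
fn diff_stores(cache: &BTreeMap<String, u32>, log: &BTreeMap<String, u32>) -> StoreDiff {
    let mut d = StoreDiff::default();
    for (key, log_ver) in log {
        match cache.get(key) {
            None => d.missing.push(key.clone()),
            Some(v) if v != log_ver => d.mismatched.push(key.clone()),
            Some(_) => {}
        }
    }
    for key in cache.keys() {
        if !log.contains_key(key) {
            d.phantom.push(key.clone());
        }
    }
    d
}
```
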
This would have caught the mysterious deletion of the-plan, which was
likely caused by a stale or corrupt snapshot that silently dropped the
node while the capnp log still had it.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The Provenance enum couldn't represent agents defined outside the
source code. Replace it with a Text field in the capnp schema so any
agent can write its own provenance label (e.g. "extractor:write",
"rename:tombstone") without a code change.
Schema: rename old enum fields to provenanceOld, add new Text
provenance fields. Old enum kept for reading legacy records.
Migration: from_capnp_migrate() falls back to old enum when the
new text field is empty.
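
The fallback rule can be sketched without capnp as follows. The enum
variants and their labels are placeholders, not the actual legacy
Provenance values.

```rust
/// Stand-in for the legacy enum kept as provenanceOld; the variants
/// here are hypothetical.
#[derive(Debug, Clone, Copy)]
enum ProvenanceOld {
    Manual,
    Agent,
}

/// Hypothetical mapping from legacy enum values to text labels.
fn old_label(p: ProvenanceOld) -> &'static str {
    match p {
        ProvenanceOld::Manual => "manual",
        ProvenanceOld::Agent => "agent",
    }
}

/// Prefer the new text field; fall back to the old enum only when the
/// text field is empty, which is how legacy records read.
fn migrate_provenance(text: &str, old: ProvenanceOld) -> String {
    if text.is_empty() {
        old_label(old).to_string()
    } else {
        text.to_string()
    }
}
```
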
Also adds `poc-memory tail` command for viewing recent store writes.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Nodes actively found by search now show "Search hits: N ← actively
found by search, prefer to keep" in both the node section (seen by
extractor, linker, etc.) and rename candidate listings.
Extractor and rename prompts updated to respect this signal — merge
into high-hit nodes rather than demoting them, skip renaming nodes
that are working well in search.
memory-search now records which nodes it finds via the daemon's
record-hits RPC endpoint. The daemon owns the redb database
exclusively, avoiding file locking between processes.
The rename agent reads hit counts to deprioritize nodes that are
actively being found by search — renaming them would break working
queries. Daily check decays counters by 10% so stale hits fade.
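
A sketch of the daily decay, using an in-memory map in place of the redb
table. Integer rounding (floor) and dropping counters that reach zero
are assumptions about the behavior, not taken from the counters module.

```rust
use std::collections::HashMap;

/// Decay every hit counter by 10% (integer floor), then drop counters
/// that have faded to zero so stale keys do not accumulate.
fn decay_counters(counters: &mut HashMap<String, u64>) {
    for v in counters.values_mut() {
        *v = *v * 9 / 10;
    }
    counters.retain(|_, v| *v > 0);
}
```
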
Also switched RPC command reading from fixed 256-byte buffer to
read_to_string for unbounded command sizes.
First use case: search hit tracking for rename protection. Nodes
that memory-search actively finds shouldn't be renamed.
The counters module provides increment/read/decay operations backed
by redb (pure Rust, ACID, no C deps). Next step: wire into the
poc-memory daemon via RPC so the daemon owns the DB exclusively
and memory-search sends hits via RPC.
Also reverts the JSONL search-hits approach in favor of this.
Instead of a hard 7-day cutoff, sort rename candidates so the
least-recently visited come first. Naturally prioritizes unseen
nodes while allowing revisits once everything's been through.
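
The ordering can be sketched with an optional last-visit timestamp. The
`Candidate` struct is hypothetical; the point is that `None` (never
visited) sorts before any `Some` timestamp.

```rust
/// Hypothetical candidate record: key plus last visit time in epoch
/// seconds, or None if the rename agent has never seen the node.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    key: String,
    last_visited: Option<u64>,
}

/// Least-recently visited first. Option<u64> orders None before Some,
/// so unseen nodes lead the list; the sort is stable, so ties keep
/// their incoming order.
fn order_rename_candidates(cands: &mut [Candidate]) {
    cands.sort_by_key(|c| c.last_visited);
}
```
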
The daemon was getting stuck when a claude subprocess hung — no
completion logged, job blocked forever, pending queue growing.
Use spawn() + watchdog thread instead of blocking output(). The
watchdog sleeps in 1s increments checking a cancel flag, sends
SIGTERM at 5 minutes, SIGKILL after 5s grace. Cancel flag ensures
the watchdog exits promptly when the child finishes normally.
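
A sketch of the watchdog loop using only std. Signal delivery is left
as a caller-supplied closure because std has no SIGTERM API
(`Child::kill` sends SIGKILL on Unix); the function names and the
closure are assumptions, and the timeout and tick are parameters so the
5-minute and 1s values can be passed in.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Sleep in `tick` increments until `timeout` elapses or `cancel` is
/// set. Returns true if the timeout fired (the caller should signal
/// the child), false if the job finished first and set the flag.
fn watchdog_wait(cancel: &AtomicBool, timeout: Duration, tick: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if cancel.load(Ordering::SeqCst) {
            return false;
        }
        let now = Instant::now();
        if now >= deadline {
            return true;
        }
        thread::sleep(tick.min(deadline - now));
    }
}

/// Run the watchdog on its own thread. `on_timeout` stands in for the
/// SIGTERM, grace period, SIGKILL sequence.
fn spawn_watchdog<F>(
    cancel: Arc<AtomicBool>,
    timeout: Duration,
    tick: Duration,
    on_timeout: F,
) -> thread::JoinHandle<bool>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(move || {
        let fired = watchdog_wait(&cancel, timeout, tick);
        if fired {
            on_timeout();
        }
        fired
    })
}
```

The job thread would call `child.wait()`, then set the cancel flag and
join the watchdog, so a normal exit costs at most one tick of delay.
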
Any time an agent creates a new node (WRITE_NODE) or the fact miner
stores extracted facts, a naming sub-agent now checks for conflicts
and ensures the key is meaningful:
- find_conflicts() searches existing nodes via component matching
- Haiku LLM decides: CREATE (good name), RENAME (better name),
or MERGE_INTO (fold into existing node)
- WriteNode actions may be converted to Refine on MERGE_INTO
Also updates the rename agent to handle _facts-<UUID> nodes —
these are no longer skipped, and the prompt explains how to name
them based on their domain/claim content.
Default search now uses exact key match only. Component matching
(--fuzzy) and content search (--content) are explicit flags. This
makes missing graph structure visible instead of silently falling
back to broad matching.
Shift from pattern abstraction (creating new nodes) to distillation
(refining existing nodes, demoting redundancies). Priority order:
merge redundancies > file into existing > improve existing > create new.
Query changed to neighborhood-aware: seed → spread → limit, so the
extractor works on related nodes rather than random high-priority ones.
New action type that halves a node's weight (min 0.05), enabling
extractors to mark redundant nodes for decay without deleting them.
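
The weight rule amounts to one line; a sketch with a hypothetical
function name:

```rust
const MIN_WEIGHT: f32 = 0.05;

/// Halve a node's weight, never going below the 0.05 floor.
fn demote_weight(weight: f32) -> f32 {
    (weight * 0.5).max(MIN_WEIGHT)
}
```

Repeated demotion converges on the floor rather than zero, so a demoted
node stays reachable.
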
Parser, apply logic, depth computation, and display all updated.
The daemon's compute_graph_health had a duplicated copy of the
consolidation planning thresholds that had drifted from the canonical
version (α<2.0 → +7 replay in daemon vs +10 in neuro).
Split consolidation_plan into _inner(store, detect_interference) so
the daemon can call consolidation_plan_quick (skips O(n²) interference)
while using the same threshold logic.
- Add run_and_apply() — combines run_one_agent + action application
into one call. Used by daemon job_consolidation_agent and
consolidate_full, which had identical run+apply loops.
- Port split_plan_prompt() to use split.agent via defs::resolve_placeholders
instead of loading the separate split-plan.md template. Make
resolve_placeholders public for this.
- Delete prompts/split-plan.md — superseded by agents/split.agent
which was already the canonical definition.
- Add compact_timestamp() to store — replaces 5 copies of
format_datetime(now_epoch()).replace([':', '-', 'T'], "")
Also fixes missing seconds (format_datetime only had HH:MM).
- Add ConsolidationPlan::to_agent_runs() — replaces identical
plan-to-runs-list expansion in consolidate.rs and daemon.rs.
- Port job_rename_agent to use run_one_agent — eliminates manual
prompt building, LLM call, report storage, and visit recording
that duplicated the shared pipeline.
- Rename Confidence::weight()/value() to delta_weight()/gate_value()
to clarify the distinction (delta metrics vs depth gating).
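
For the compact_timestamp() item in the list above, the string
transformation looks roughly like this. The input format
(YYYY-MM-DDTHH:MM:SS) and the helper name are assumptions; the real
function formats the current time itself.

```rust
/// Strip the separators from an ISO-style datetime string,
/// e.g. "2026-03-14T09:05:07" becomes "20260314090507".
fn compact_from_formatted(formatted: &str) -> String {
    formatted.replace([':', '-', 'T'], "")
}
```

With the old HH:MM-only input the same call produced a 12-digit stamp,
so two writes within the same minute got identical timestamps.
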
Three places duplicated the agent execution loop (build prompt → call
LLM → store output → parse actions → record visits): consolidate.rs,
knowledge.rs, and daemon.rs. Extract into run_one_agent() in
knowledge.rs that all three now call.
Also standardize consolidation agent prompts to use WRITE_NODE/LINK/REFINE
— the same commands the parser handles. Previously agents output
CATEGORIZE/NOTE/EXTRACT/DIGEST/DIFFERENTIATE/MERGE/COMPRESS which were
silently dropped after the second-LLM-call removal.
The consolidation pipeline previously made a second Sonnet call to
extract structured JSON actions from agent reports. This was both
wasteful (extra LLM call per consolidation) and lossy (only extracted
links and manual items, ignoring WRITE_NODE/REFINE).
Now actions are parsed and applied inline after each agent runs, using
the same parse_all_actions() parser as the knowledge loop. The daemon
scheduler's separate apply phase is also removed.
Also deletes 8 superseded/orphaned prompt .md files (784 lines) that
have been replaced by .agent files.
The four knowledge agents (observation, extractor, connector,
challenger) were hardcoded in knowledge.rs with their own node
selection logic that bypassed the query pipeline and visit tracking.
Now they're .agent files like the consolidation agents:
- extractor: not-visited:extractor,7d | sort:priority | limit:20
- observation: uses new {{CONVERSATIONS}} placeholder
- connector: type:semantic | not-visited:connector,7d
- challenger: type:semantic | not-visited:challenger,14d
The knowledge loop's run_cycle dispatches through defs::run_agent
instead of calling hardcoded functions, so all agents get visit
tracking automatically. This means the extractor now sees _facts-*
and _mined-transcripts nodes that it was previously blind to.
~200 lines of dead code removed (old runner functions, spectral
clustering for node selection, per-agent LLM dispatch).
New placeholders in defs.rs:
- {{CONVERSATIONS}} — raw transcript fragments for observation agent
- {{TARGETS}} — alias for {{NODES}} (challenger compatibility)
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Adds `poc-memory daemon run-agent <type> <count>` CLI command that
sends an RPC to the daemon to queue agent runs, instead of spawning
separate processes.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The parser was using split_once("\n\n") which broke when the prompt
started immediately after the JSON header (no blank line). Parse
the first line as JSON, treat the rest as the prompt body.
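
A sketch of the first-line split. It uses a cheap brace check instead of
real JSON parsing to stay dependency-free, and the function name and the
"agent" header field in the usage note are made up for illustration.

```rust
/// Split an agent file into (JSON header line, prompt body).
/// The header is always the first line; any blank lines between it
/// and the prompt are skipped, so both layouts parse the same way.
fn split_agent_file(text: &str) -> Option<(&str, &str)> {
    let (header, rest) = text.split_once('\n')?;
    let header = header.trim();
    // Stand-in for JSON parsing: the real parser would deserialize here.
    if !(header.starts_with('{') && header.ends_with('}')) {
        return None;
    }
    Some((header, rest.trim_start_matches(['\r', '\n'])))
}
```

By contrast, `split_once("\n\n")` on a file like
`{"agent":"linker"}\nYou are the linker.\n\nRules...` splits at the
first blank line inside the prompt, so the "header" half also contains
the first prompt paragraph.
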
All agents now go through the config-driven path via .agent files.
agent_prompt() just delegates to defs::run_agent(). Remove the 100+
line hardcoded match block and the _pub wrapper functions — make the
formatters pub directly.
Replace the formatter dispatch with a generic {{placeholder}} lookup
system. Placeholders in prompt templates are resolved at runtime from
a table: topology, nodes, episodes, health, pairs, rename, split.
The query in the header selects what to operate on (keys for visit
tracking); placeholders pull in formatted context. Placeholders that
produce their own node selection (pairs, rename) contribute keys back.
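
The lookup can be sketched as a single pass over the template. This
version takes pre-rendered strings from a map and leaves unknown
placeholders untouched; both choices, and the function name, are
assumptions, and the real resolver also reports keys back for visit
tracking.

```rust
use std::collections::HashMap;

/// Replace each {{name}} in `template` with its entry in `table`.
/// Unknown names and an unterminated "{{" are copied through as-is.
fn expand_placeholders(template: &str, table: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                match table.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        out.push_str("{{");
                        out.push_str(&after[..end]);
                        out.push_str("}}");
                    }
                }
                rest = &after[end + 2..];
            }
            None => {
                // No closing braces: keep the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```
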
Port health, separator, rename, and split agents to .agent files.
All 7 agents now use the config-driven path.
Each agent is a .agent file: JSON config on the first line, blank line,
then the raw prompt markdown. Fully self-contained, fully readable.
No separate template files needed.
Agents dir: checked into repo at poc-memory/agents/. Code looks there
first (via CARGO_MANIFEST_DIR), falls back to ~/.claude/memory/agents/.
Three agents migrated: replay, linker, transfer.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>