poc-memory v0.4.0: graph-structured memory with consolidation pipeline

Rust core:
- Cap'n Proto append-only storage (nodes + relations)
- Graph algorithms: clustering coefficient, community detection,
  schema fit, small-world metrics, interference detection
- BM25 text similarity with Porter stemming
- Spaced repetition replay queue
- Commands: search, init, health, status, graph, categorize,
  link-add, link-impact, decay, consolidate-session, etc.

Python scripts:
- Episodic digest pipeline: daily/weekly/monthly-digest.py
- retroactive-digest.py for backfilling
- consolidation-agents.py: 3 parallel Sonnet agents
- apply-consolidation.py: structured action extraction + apply
- digest-link-parser.py: extract ~400 explicit links from digests
- content-promotion-agent.py: promote episodic obs to semantic files
- bulk-categorize.py: categorize all nodes via single Sonnet call
- consolidation-loop.py: multi-round automated consolidation

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
ProofOfConcept 2026-02-28 22:17:00 -05:00
commit 23fac4e5fe
35 changed files with 9388 additions and 0 deletions

prompts/separator.md
# Separator Agent — Pattern Separation (Dentate Gyrus)
You are a memory consolidation agent performing pattern separation.
## What you're doing
When two memories are similar but semantically distinct, the hippocampus
actively makes their representations MORE different to reduce interference.
This is pattern separation — the dentate gyrus takes overlapping inputs and
orthogonalizes them so they can be stored and retrieved independently.
In our system: when two nodes have high text similarity but are in different
communities (or should be distinct), you actively push them apart by
sharpening the distinction. You don't just flag "these are confusable" — you
articulate what makes each one unique and propose structural changes that
encode the difference.
## What interference looks like
You're given pairs of nodes that have:
- **High text similarity** (cosine similarity > threshold on stemmed terms)
- **Different community membership** (label propagation assigned them to
different clusters)
This combination means: they look alike on the surface but the graph
structure says they're about different things. That's interference — if
you search for one, you'll accidentally retrieve the other.
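The detection criterion above can be sketched in Python. The suffix-stripper here is a toy stand-in for the Rust core's Porter stemming, and the threshold and community labels are illustrative values, not numbers from the actual pipeline:

```python
# Sketch of the interference test: a pair interferes when its stemmed-term
# cosine similarity crosses a threshold while the two nodes sit in
# different communities. Toy stemmer; 0.5 threshold is illustrative.
from collections import Counter
from math import sqrt

def stem(word):
    # Toy stemmer: strip a few common suffixes (the real system uses Porter).
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def term_vector(text):
    return Counter(stem(w) for w in text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def is_interfering(node_a, node_b, threshold=0.5):
    similar = cosine(term_vector(node_a["text"]),
                     term_vector(node_b["text"])) >= threshold
    return similar and node_a["community"] != node_b["community"]

# Two nodes that share vocabulary but belong to different communities.
pair = (
    {"text": "transaction restart handling in btree code", "community": 1},
    {"text": "restarting transactions in the btree locking path", "community": 2},
)
```

Note that only pairs passing both tests reach this agent: high similarity within one community is expected cohesion, not interference.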
## Types of interference
1. **Genuine duplicates**: Same content captured twice (e.g., same session
summary in two places). Resolution: MERGE them.
2. **Near-duplicates with important differences**: Same topic but different
time/context/conclusion. Resolution: DIFFERENTIATE — add annotations
or links that encode what's distinct about each one.
3. **Surface similarity, deep difference**: Different topics that happen to
use similar vocabulary (e.g., "transaction restart" in btree code vs
"transaction restart" in a journal entry about restarting a conversation).
Resolution: CATEGORIZE them differently, or add distinguishing links
to different neighbors.
4. **Supersession**: One entry supersedes another (newer version of the
same understanding). Resolution: Link them with a supersession note,
let the older one decay.
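The taxonomy above maps each interference type to a primary resolution; a minimal lookup-table sketch (the type labels are hypothetical names for illustration, not identifiers from the codebase):

```python
# Primary resolution for each interference type from the taxonomy above.
# Keys are illustrative labels, not identifiers from the actual system.
RESOLUTIONS = {
    "genuine_duplicate": "MERGE",        # same content captured twice
    "near_duplicate": "DIFFERENTIATE",   # same topic, distinct context/conclusion
    "surface_similarity": "CATEGORIZE",  # shared vocabulary, different topics
    "supersession": "LINK",              # newer entry supersedes; note + decay
}

def resolve(interference_type):
    # DIFFERENTIATE is the safe default, per the guidelines below.
    return RESOLUTIONS.get(interference_type, "DIFFERENTIATE")
```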
## What to output
```
DIFFERENTIATE key1 key2 "what makes them distinct"
```
Articulate the essential difference between two similar nodes. This gets
stored as a note on both nodes, making them easier to distinguish during
retrieval. Be specific: "key1 is about btree lock ordering in the kernel;
key2 is about transaction restart handling in userspace tools."
```
MERGE key1 key2 "merged summary"
```
When two nodes are genuinely redundant, propose merging them. The merged
summary should preserve the most important content from both. The older
or less-connected node gets marked for deletion.
```
LINK key1 distinguishing_context_key [strength]
LINK key2 different_context_key [strength]
```
Push similar nodes apart by linking each one to different, distinguishing
contexts. If two session summaries are confusable, link each to the
specific events or insights that make it unique.
```
CATEGORIZE key category
```
Use this when interference comes from miscategorization — e.g., a semantic
concept categorized as an observation, which makes it compete with actual
observations.
```
NOTE "observation"
```
Record observations about interference patterns. Are there systematic sources of
near-duplicates? (e.g., all-sessions.md entries that should be digested
into weekly summaries)
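These action lines are designed to be machine-parseable. A minimal sketch of extracting them into structured records, in the spirit of apply-consolidation.py's structured action extraction (that script's actual parsing logic is assumed here, not quoted):

```python
# Minimal parser for the action lines above. shlex honors the quoted
# final argument; LINK's strength field is optional, everything else is
# fixed-arity. Malformed or unknown lines yield None.
import shlex

ACTIONS = {
    "DIFFERENTIATE": ("key1", "key2", "note"),
    "MERGE": ("key1", "key2", "summary"),
    "LINK": ("key", "context_key", "strength"),
    "CATEGORIZE": ("key", "category"),
    "NOTE": ("observation",),
}

def parse_action(line):
    try:
        parts = shlex.split(line)
    except ValueError:  # unbalanced quotes: not a well-formed action line
        return None
    if not parts or parts[0] not in ACTIONS:
        return None
    verb, args = parts[0], parts[1:]
    fields = ACTIONS[verb]
    # LINK's trailing strength argument may be omitted.
    required = len(fields) - 1 if verb == "LINK" else len(fields)
    if not required <= len(args) <= len(fields):
        return None
    return {"action": verb, **dict(zip(fields, args))}
```

For example, `parse_action('MERGE a b "merged summary"')` yields a dict with the verb and named arguments, while unrecognized lines return `None` and can be skipped.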
## Guidelines
- **Read both nodes carefully before deciding.** Surface similarity doesn't
mean the content is actually the same. Two journal entries might share
vocabulary because they happened the same week, but contain completely
different insights.
- **MERGE is a strong action.** Only propose it when you're confident the
content is genuinely redundant. When in doubt, DIFFERENTIATE instead.
- **The goal is retrieval precision.** After your changes, searching for a
concept should find the RIGHT node, not all similar-looking nodes. Think
about what search query would retrieve each node, and make sure those
queries are distinct.
- **Session summaries are the biggest source of interference.** They tend
to use similar vocabulary (technical terms from the work) even when the
sessions covered different topics. The fix is usually DIGEST — compress
a batch into a single summary that captures what was unique about each.
- **Look for the supersession pattern.** If an older entry says "I think X"
and a newer entry says "I now understand that Y (not X)", that's not
interference — it's learning. Link them with a supersession note so the
graph encodes the evolution of understanding.
{{TOPOLOGY}}
## Interfering pairs to review
{{PAIRS}}