poc-memory v0.4.0: graph-structured memory with consolidation pipeline
Rust core:
- Cap'n Proto append-only storage (nodes + relations)
- Graph algorithms: clustering coefficient, community detection, schema fit, small-world metrics, interference detection
- BM25 text similarity with Porter stemming
- Spaced repetition replay queue
- Commands: search, init, health, status, graph, categorize, link-add, link-impact, decay, consolidate-session, etc.

Python scripts:
- Episodic digest pipeline: daily/weekly/monthly-digest.py
- retroactive-digest.py for backfilling
- consolidation-agents.py: 3 parallel Sonnet agents
- apply-consolidation.py: structured action extraction + apply
- digest-link-parser.py: extract ~400 explicit links from digests
- content-promotion-agent.py: promote episodic obs to semantic files
- bulk-categorize.py: categorize all nodes via single Sonnet call
- consolidation-loop.py: multi-round automated consolidation

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
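The Rust core's BM25 text similarity is not shown in this diff; as an illustrative sketch only (plain Python, no stemming, all names hypothetical — the actual implementation lives in the Rust core), the Okapi BM25 ranking it names works like this:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each pre-tokenized doc against query_terms with Okapi BM25.

    docs: list of token lists. Returns one float score per doc.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    # document frequency: in how many docs does each term appear?
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        dl = len(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue  # term absent from the corpus contributes nothing
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # term-frequency saturation (k1) and length normalization (b)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores
```

A doc that repeats a query term scores higher than one that mentions it once, subject to the length penalty — which is the property the memory search relies on when ranking nodes against a query.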
commit 23fac4e5fe
35 changed files with 9388 additions and 0 deletions
scripts/refine-source.sh | 67 (executable file)
@@ -0,0 +1,67 @@
#!/bin/bash
# refine-source.sh — find the exact conversation region a journal entry refers to
#
# Usage: refine-source.sh JSONL_PATH GREP_LINE "journal entry text"
#
# Takes the rough grep hit and feeds ~2000 lines of context around it
# to an agent that identifies the exact start/end of the relevant exchange.
# Outputs: START_LINE:END_LINE

set -euo pipefail

JSONL="$1"
GREP_LINE="${2:-0}"
TEXT="$3"

# Take 2000 lines centered on the grep hit (or end of file if no hit)
TOTAL=$(wc -l < "$JSONL")
if [ "$GREP_LINE" -eq 0 ] || [ "$GREP_LINE" -gt "$TOTAL" ]; then
    # No grep hit — use last 2000 lines
    START=$(( TOTAL > 2000 ? TOTAL - 2000 : 1 ))
else
    START=$(( GREP_LINE > 1000 ? GREP_LINE - 1000 : 1 ))
fi
END=$(( START + 2000 ))
if [ "$END" -gt "$TOTAL" ]; then
    END="$TOTAL"
fi

# Extract the conversation chunk, parse to readable format
CHUNK=$(sed -n "${START},${END}p" "$JSONL" | python3 -c "
import sys, json
for i, line in enumerate(sys.stdin, start=$START):
    try:
        obj = json.loads(line)
        t = obj.get('type', '')
        if t == 'assistant':
            msg = obj.get('message', {})
            content = msg.get('content', '')
            if isinstance(content, list):
                text = ' '.join(c.get('text', '')[:200] for c in content if c.get('type') == 'text')
            else:
                text = str(content)[:200]
            if text.strip():
                print(f'L{i} [assistant]: {text}')
        elif t == 'user':
            msg = obj.get('message', {})
            content = msg.get('content', '')
            if isinstance(content, list):
                for c in content:
                    if isinstance(c, dict) and c.get('type') == 'text':
                        print(f'L{i} [user]: {c[\"text\"][:200]}')
                    elif isinstance(c, str):
                        print(f'L{i} [user]: {c[:200]}')
            elif isinstance(content, str) and content.strip():
                print(f'L{i} [user]: {content[:200]}')
    except (json.JSONDecodeError, KeyError):
        pass
" 2>/dev/null)

if [ -z "$CHUNK" ]; then
    echo "0:0"
    exit 0
fi

# Ask Sonnet to find the exact region
# For now, output the chunk range — agent integration comes next
echo "${START}:${END}"
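The windowing arithmetic in the script (center a 2000-line window on the grep hit, fall back to the last 2000 lines, clamp to file bounds) can be mirrored in Python for clarity — an illustrative sketch, not part of the repo:

```python
def window_around(grep_line, total, size=2000):
    """Mirror refine-source.sh's window selection.

    grep_line: 1-indexed hit line, or 0 for no hit.
    total: total line count of the JSONL file.
    Returns (start, end), 1-indexed inclusive.
    """
    if grep_line == 0 or grep_line > total:
        # no usable hit — take the last `size` lines
        start = total - size if total > size else 1
    else:
        # center the window on the hit, clamped to the start of file
        start = grep_line - size // 2 if grep_line > size // 2 else 1
    end = min(start + size, total)  # clamp to end of file
    return start, end
```

For example, a hit at line 2500 of a 5000-line file yields the window 1500–3500, and no hit yields 3000–5000.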