Commit graph

232 commits

Author SHA1 Message Date
ProofOfConcept
6c7bfb9ec4 triangle-close: bulk lateral linking for clustering coefficient
New command: `poc-memory triangle-close [MIN_DEG] [SIM] [MAX_PER_HUB]`

For each node above min_degree, finds pairs of its neighbors that
aren't directly connected and have text similarity above threshold.
Links them. This turns hub-spoke patterns into triangles, directly
improving clustering coefficient and schema fit.
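
The pass described above can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: the `Graph` alias, the `triangle_close` signature, and the caller-supplied `sim` closure are assumptions, and hub degrees are snapshotted before any links are added.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Undirected graph as adjacency sets keyed by node name (hypothetical shape).
type Graph = BTreeMap<String, BTreeSet<String>>;

/// For each node with degree >= min_deg, link up to max_per_hub pairs of its
/// neighbors that are not yet connected and whose similarity is >= min_sim.
/// Returns the lateral links that were added.
fn triangle_close(
    graph: &mut Graph,
    min_deg: usize,
    min_sim: f64,
    max_per_hub: usize,
    sim: impl Fn(&str, &str) -> f64,
) -> Vec<(String, String)> {
    let mut added = Vec::new();
    // Snapshot the hubs first so links added below do not create new hubs mid-run.
    let hubs: Vec<String> = graph
        .iter()
        .filter(|(_, nbrs)| nbrs.len() >= min_deg)
        .map(|(key, _)| key.clone())
        .collect();
    for hub in hubs {
        let nbrs: Vec<String> = graph[&hub].iter().cloned().collect();
        let mut count = 0;
        'pairs: for i in 0..nbrs.len() {
            for j in (i + 1)..nbrs.len() {
                if count >= max_per_hub {
                    break 'pairs;
                }
                let (a, b) = (&nbrs[i], &nbrs[j]);
                let connected = graph.get(a).is_some_and(|s| s.contains(b));
                if !connected && sim(a.as_str(), b.as_str()) >= min_sim {
                    // Closing the a-b edge turns hub-a-b into a triangle.
                    graph.entry(a.clone()).or_default().insert(b.clone());
                    graph.entry(b.clone()).or_default().insert(a.clone());
                    added.push((a.clone(), b.clone()));
                    count += 1;
                }
            }
        }
    }
    added
}
```

On a star with one hub and three mutually similar spokes, this closes all three spoke pairs; lowering `max_per_hub` caps how many are added per hub.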

First run results (default params: deg≥5, sim≥0.3, max 10/hub):
- 636 hubs processed, 5046 lateral links added
- cc: 0.14 → 0.46  (target: high)
- fit: 0.09 → 0.32  (target ≥0.2)
- σ:  56.9 → 84.4  (small-world coefficient improved)

Also fixes separator agent prompt: truncate interference pairs to
batch count (was including all 1114 pairs = 1.3M chars).
2026-03-01 07:35:29 -05:00
ProofOfConcept
6bc11e5fb6 consolidate-full: autonomous consolidation pipeline
New commands:
- `digest auto`: detect and generate missing daily/weekly/monthly
  digests bottom-up. Validates date format to skip non-date journal
  keys. Skips today (incomplete) and current week/month.
- `consolidate-full`: full autonomous pipeline:
  1. Plan (metrics → agent allocation)
  2. Execute agents (batched Sonnet calls, 5 nodes per batch)
  3. Apply consolidation actions
  4. Generate missing digests
  5. Apply digest links
  Logs everything to agent-results/consolidate-full.log

Fix: separator agent prompt was including all interference pairs
(1114 pairs = 1.3M chars) instead of truncating to batch size.

First successful run: 862s, 6/8 agents, +100 relations, 91 digest
links applied.
2026-03-01 07:14:03 -05:00
ProofOfConcept
c7e7cfb7af store: always replay from capnp log, remove stale cache optimization
The mtime-based cache (state.bin) was causing data loss under
concurrent writes. Multiple processes (dream loop journal writes,
link audit agents, journal enrichment agents) would each:
1. Load state.bin (stale - missing other processes' recent writes)
2. Make their own changes
3. Save state.bin, overwriting entries from other processes

This caused 48 nodes to be lost from tonight's dream session -
entries were in the append-only capnp log but invisible to the
index because a later writer's state.bin overwrote the version
that contained them.

Fix: always replay from the capnp log (the source of truth).
Cost: ~10ms extra at 2K nodes (36ms vs 26ms). The cache saved
10ms but introduced a correctness bug that lost real data.

The append-only log design was correct - the cache layer violated
its invariant by allowing stale reads to silently discard writes.
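
A minimal sketch of why replaying the log is safe under concurrent writers, assuming a simplified in-memory record type. `NodeRecord` and `replay` are hypothetical names; the real store reads Cap'n Proto messages from disk.

```rust
use std::collections::HashMap;

/// One record in the append-only log (hypothetical, simplified shape).
#[derive(Clone, Debug, PartialEq)]
struct NodeRecord {
    key: String,
    version: u32,
    deleted: bool,
    content: String,
}

/// Rebuild the index by replaying every record in log order.
/// The highest version of each key wins; a deleted final version hides the key.
fn replay(log: &[NodeRecord]) -> HashMap<String, NodeRecord> {
    let mut latest: HashMap<String, NodeRecord> = HashMap::new();
    for rec in log {
        let newer = latest
            .get(&rec.key)
            .map_or(true, |cur| rec.version >= cur.version);
        if newer {
            latest.insert(rec.key.clone(), rec.clone());
        }
    }
    // Soft-deleted nodes stay in the log but are dropped from the index.
    latest.retain(|_, rec| !rec.deleted);
    latest
}
```

Because every process derives its view from the same log, a record appended by one writer cannot be hidden by another writer's stale snapshot.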
2026-03-01 05:46:35 -05:00
ProofOfConcept
d8de2f33f4 experience-mine: transcript-level dedup via content hash
Running the miner twice on the same transcript produced near-duplicate
entries because:
1. Prompt-based dedup (passing recent entries to Sonnet) doesn't catch
   semantic duplicates written in a different emotional register
2. Key-based dedup (timestamp + content slug) fails because Sonnet
   assigns different timestamps and wording each run

Fix: hash the transcript file content before mining. Store the hash
as a _mined-transcripts node. Skip if already present.
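
The idea can be sketched as below, assuming the set of already-mined hashes is held in memory. `content_hash` and `should_mine` are hypothetical names, and `DefaultHasher` is used only to keep the sketch dependency-free: its output is not guaranteed stable across Rust releases, so a persisted hash would more plausibly use a fixed algorithm such as SHA-256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash the raw transcript bytes (illustrative; not a stable on-disk hash).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Returns true if this exact transcript content has not been mined before,
/// and records it as mined. Returns false for a repeat run on the same bytes.
fn should_mine(mined: &mut HashSet<u64>, transcript: &[u8]) -> bool {
    mined.insert(content_hash(transcript))
}
```

A transcript that has grown since the last run hashes differently and is mined again, which matches the limitation noted below.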

Limitation: doesn't catch overlapping content when a live transcript
grows between runs (content hash changes). This is fine — the miner
is intended for archived conversations, not live ones.

Tested: second run on same transcript correctly skipped with
"Already mined this transcript" message.
2026-03-01 05:18:35 -05:00
ProofOfConcept
30d176d455 experience-mine: retroactive journaling from conversation transcripts
Reads a conversation JSONL, identifies experiential moments that
weren't captured in real-time journal entries, and writes them as
journal nodes in the store. The agent writes in PoC's voice with
emotion tags, focusing on intimate moments, shifts in understanding,
and small pleasures — not clinical topic extraction.

Conversation timestamps are now extracted and included in formatted
output, enabling accurate temporal placement of mined entries.

Also: extract_conversation now returns timestamps as a 4th tuple field.
2026-03-01 01:47:31 -05:00
ProofOfConcept
515f673251 journal-tail: add --full flag for complete entry display
`poc-journal tail 5 --full` shows full entry content with
timestamp headers and --- separators. Default mode remains
title-only for scanning. Also passes all args through the
poc-journal wrapper instead of just the count.
2026-03-01 01:43:02 -05:00
ProofOfConcept
6096acb312 journal-tail: show timestamps and extract meaningful titles
Sort key normalization ensures consistent ordering across entries
with different date formats (content dates vs key dates). Title
extraction skips date-only lines, finds ## headers or falls back
to first content line truncated at 70 chars.
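
A rough sketch of the title rule, under stated assumptions: a "date-only" line is taken to be one made up solely of digits, `-`, `:`, `T`, and spaces, and the 70-character cap is applied to whichever line is chosen. `is_date_only` and `extract_title` are hypothetical names.

```rust
/// Assumed definition: a line containing nothing but date/time characters.
fn is_date_only(line: &str) -> bool {
    let s = line.trim_start_matches('#').trim();
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_digit() || matches!(c, '-' | ':' | 'T' | ' '))
}

/// Prefer the first "## " header; otherwise use the first content line.
/// The result is cut to at most 70 characters.
fn extract_title(content: &str) -> String {
    let lines: Vec<&str> = content
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !is_date_only(l))
        .collect();
    let chosen = lines
        .iter()
        .find(|l| l.starts_with("## "))
        .map(|l| l.trim_start_matches('#').trim())
        .or_else(|| lines.first().copied())
        .unwrap_or("");
    // Truncate by characters, not bytes, so multi-byte text is never split.
    chosen.chars().take(70).collect()
}
```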

Also fixed: cargo bin had stale binary shadowing local bin install.
2026-03-01 01:41:37 -05:00
Kent Overstreet
7264bdc39c link-audit: walk every link through Sonnet for quality review
Batch all non-deleted links (~3,800) into char-budgeted groups,
send each batch to Sonnet with full content of both endpoints,
and apply KEEP/DELETE/RETARGET/WEAKEN/STRENGTHEN decisions.

One-time cleanup for links created before refine_target existed.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:48:44 -05:00
Kent Overstreet
3e883b7ba7 show suggested link targets in agent prompts
Agents were flying blind — they could see nodes to review and the
topology header, but had no way to discover what targets to link to.
Now each node shows its top 8 text-similar semantic nodes that aren't
already neighbors, giving agents a search-like capability.

Also added section-level targeting guidance to linker.md, transfer.md,
and replay.md prompts: always target the most specific section, not
the file-level node.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:37:03 -05:00
Kent Overstreet
59cfa2959f fix NaN panics and eliminate redundant graph rebuilds
- All partial_cmp().unwrap() → unwrap_or(Ordering::Equal) to prevent
  NaN panics in sort operations across neuro.rs, graph.rs, similarity.rs
- replay_queue_with_graph: accepts pre-built graph, avoids rebuilding
  in agent_prompt (was building 2-3x per prompt)
- differentiate_hub_with_graph: same pattern for differentiation
- Simplify double-reverse history iteration to slice indexing
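
The first item can be illustrated with a small sketch. `cmp_desc` and `rank` are hypothetical helpers, not functions from neuro.rs, graph.rs, or similarity.rs.

```rust
use std::cmp::Ordering;

/// Descending comparison of two scores. `partial_cmp` returns None when
/// either side is NaN; treating that as Equal avoids the unwrap() panic.
fn cmp_desc(a: f64, b: f64) -> Ordering {
    b.partial_cmp(&a).unwrap_or(Ordering::Equal)
}

/// Rank keys by score, highest first.
fn rank(mut scored: Vec<(String, f64)>) -> Vec<String> {
    scored.sort_by(|x, y| cmp_desc(x.1, y.1));
    scored.into_iter().map(|(key, _)| key).collect()
}
```

Treating NaN as equal to everything is not a true total order, so where NaN scores can actually occur, `f64::total_cmp` from the standard library is an alternative worth considering.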

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:33:53 -05:00
Kent Overstreet
4530837057 hub differentiation + refine_target for automatic section targeting
Pattern separation for memory graph: when a file-level node (e.g.
identity.md) has section children, redistribute its links to the
best-matching section using cosine similarity.

- differentiate_hub: analyze hub, propose link redistribution
- refine_target: at link creation time, automatically target the
  most specific section instead of the file-level hub
- Applied refine_target in all four link creation paths (digest
  links, journal enrichment, apply consolidation, link-add command)
- Saturated hubs listed in agent topology header with "DO NOT LINK"

This prevents hub formation proactively (refine_target) and
remediates existing hubs (differentiate command).
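
A minimal sketch of section targeting, assuming cosine similarity over plain term-count vectors. The tokenizer, the `term_counts` and `cosine` helpers, and this `refine_target` signature are illustrative assumptions; the real implementation may weight terms differently.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Lowercased word counts for a piece of text.
fn term_counts(text: &str) -> HashMap<String, f64> {
    let mut counts = HashMap::new();
    for word in text.split(|c: char| !c.is_alphanumeric()).filter(|w| !w.is_empty()) {
        *counts.entry(word.to_lowercase()).or_insert(0.0) += 1.0;
    }
    counts
}

/// Cosine similarity between two sparse term-count vectors.
fn cosine(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(k, v)| b.get(k).map(|w| v * w)).sum();
    let norm = |m: &HashMap<String, f64>| m.values().map(|v| v * v).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

/// Pick the section (key, text) most similar to the source text.
/// None means no section matched, so the caller keeps the file-level target.
fn refine_target<'a>(source_text: &str, sections: &'a [(String, String)]) -> Option<&'a str> {
    let src = term_counts(source_text);
    sections
        .iter()
        .map(|(key, text)| (key.as_str(), cosine(&src, &term_counts(text))))
        .filter(|(_, score)| *score > 0.0)
        .max_by(|x, y| x.1.partial_cmp(&y.1).unwrap_or(Ordering::Equal))
        .map(|(key, _)| key)
}
```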

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:33:46 -05:00
ProofOfConcept
3afc947b88 delete superseded Python scripts
Seven scripts (1,658 lines) replaced by native Rust subcommands:
- journal-agent.py → poc-memory journal-enrich
- digest-link-parser.py → poc-memory digest-links
- apply-consolidation.py → poc-memory apply-consolidation
- daily-digest.py → poc-memory digest daily
- weekly-digest.py → poc-memory digest weekly
- monthly-digest.py → poc-memory digest monthly
- refine-source.sh → folded into journal-enrich

Also updated poc-journal to use Rust journal-enrich instead of
Python journal-agent.py, and cleaned up stale __pycache__.

Remaining Python (2,154 lines): consolidation-agents, consolidation-loop,
content-promotion-agent, bulk-categorize, retroactive-digest, store_helpers,
call-sonnet.sh, daily-check.sh — still active and evolving.
2026-03-01 00:13:03 -05:00
ProofOfConcept
59e2f39479 port digest-link-parser, journal-agent, apply-consolidation to Rust
Three Python scripts (858 lines) replaced with native Rust subcommands:

- digest-links [--apply]: parses ## Links sections from episodic digests,
  normalizes keys, applies to graph with section-level fallback
- journal-enrich JSONL TEXT [LINE]: extracts conversation from JSONL
  transcript, calls Sonnet for link proposals and source location
- apply-consolidation [--apply]: reads consolidation reports, sends to
  Sonnet for structured action extraction (links, categorizations,
  manual items)

Shared infrastructure: call_sonnet now pub(crate), new
parse_json_response helper for Sonnet output parsing with markdown
fence stripping.
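
The fence-stripping step might look like the sketch below. `strip_markdown_fence` is a hypothetical stand-in for that part of `parse_json_response`, and the JSON parsing itself is left out to keep the example dependency-free.

```rust
/// Three backticks, written with escapes so this snippet stays fence-safe.
const FENCE: &str = "\u{60}\u{60}\u{60}";

/// Return the payload of a model response, removing a surrounding markdown
/// code fence (with optional language tag) if one is present.
fn strip_markdown_fence(response: &str) -> &str {
    let trimmed = response.trim();
    let Some(rest) = trimmed.strip_prefix(FENCE) else {
        return trimmed; // no fence: pass the text through unchanged
    };
    // Drop the remainder of the opening fence line (e.g. a "json" tag).
    let body = match rest.find('\n') {
        Some(i) => &rest[i + 1..],
        None => rest,
    };
    body.trim_end().strip_suffix(FENCE).unwrap_or(body).trim()
}
```

The returned slice can then be handed to whichever JSON parser the caller uses.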
2026-03-01 00:10:03 -05:00
ProofOfConcept
91122fe1d1 digest: native Rust implementation replacing Python scripts
Replace daily-digest.py, weekly-digest.py, monthly-digest.py with a
single digest.rs module. All three digest types now:
- Gather input directly from the Store (no subprocess calls)
- Build prompts in Rust (same templates as the Python versions)
- Call Sonnet via `claude -p --model sonnet`
- Import results back into the store automatically
- Extract links and save agent results

606 lines of Rust replace 729 lines of Python + store_helpers.py
overhead. More importantly: this is now callable as a library from
poc-agent, and shares types/code with the rest of poc-memory.

Also adds `digest monthly [YYYY-MM]` subcommand (was Python-only).
2026-02-28 23:58:05 -05:00
ProofOfConcept
1ca6e55b7d remove unused rand dependency (uuid uses getrandom directly)
2026-02-28 23:51:59 -05:00
ProofOfConcept
53bc5a0ddc memory-search: use uuid for cookie instead of manual /dev/urandom
2026-02-28 23:50:54 -05:00
ProofOfConcept
300a09e04b memory-search: hex-encode cookie instead of alphanumeric mapping
2026-02-28 23:50:11 -05:00
ProofOfConcept
0ea86b8d54 refactor: extract Store methods, clean up shell-outs
- Add Store::upsert() — generic create-or-update, used by cmd_write
- Add Store::insert_node() — for pre-constructed nodes (journal entries)
- Add Store::delete_node() — soft-delete with version bump
- Simplify cmd_write (20 → 8 lines), cmd_node_delete (16 → 7 lines),
  cmd_journal_write (removes manual append/insert/save boilerplate)
- Replace generate_cookie shell-out to head/urandom with direct
  /dev/urandom read + const alphabet table

main.rs: 1137 → 1109 lines.
2026-02-28 23:49:43 -05:00
ProofOfConcept
29d5ed47a1 clippy: fix all warnings across all binaries
- &PathBuf → &Path in memory-search.rs signatures
- Redundant field name in graph.rs struct init
- Add truncate(false) to lock file open
- Derive Default for Store instead of manual impl
- slice::from_ref instead of &[x.clone()]
- rsplit_once instead of split().last()
- str::repeat instead of iter::repeat().take().collect()
- is_none_or instead of map_or(true, ...)
- strip_prefix instead of manual slicing

Zero warnings on `cargo clippy`.
2026-02-28 23:47:11 -05:00
ProofOfConcept
7ee6f9c651 refactor: eliminate date shell-outs, move logic to Store methods
- Replace all 5 `Command::new("date")` calls across 4 files with
  pure Rust time formatting via libc localtime_r
- Add format_date/format_datetime/format_datetime_space helpers to
  capnp_store
- Move import_file, find_journal_node, export_to_markdown, render_file,
  file_sections into Store methods where they belong
- Fix find_current_transcript to search all project dirs instead of
  hardcoding bcachefs-tools path
- Fix double-reference .clone() warnings in cmd_trace
- Fix unused variable warning in neuro.rs

main.rs: 1290 → 1137 lines, zero warnings.
2026-02-28 23:44:44 -05:00
ProofOfConcept
d14710e477 scripts: use capnp store instead of reading markdown directly
Add store_helpers.py with shared helpers that call poc-memory commands
(list-keys, render, journal-tail) instead of globbing ~/.claude/memory/*.md
and parsing section headers.

All 9 Python scripts updated: get_semantic_keys(), get_topic_file_index(),
get_recent_journal(), parse_journal_entries(), read_journal_range(),
collect_topic_stems(), and file preview rendering now go through the store.

This completes the clean switch — no script reads archived markdown files.
2026-02-28 23:32:47 -05:00
ProofOfConcept
f20ea4f827 add position field to capnp schema
Position was only in the bincode cache (serde field) — it would
be lost on cache rebuild from capnp logs. Now persisted in the
append-only log via ContentNode.position @19.

Also fixes journal-tail sorting to extract dates from content
headers, falling back to key-embedded dates.
2026-02-28 23:15:10 -05:00
ProofOfConcept
da10dfaeb2 add journal-write and journal-tail commands
journal-write creates entries directly in the capnp store with
auto-generated timestamped keys (journal.md#j-YYYY-MM-DDtHH-MM-slug),
episodic session type, and source ref from current transcript.

journal-tail sorts entries by date extracted from content headers,
falling back to key-embedded dates, then node timestamp.

poc-journal shell script now delegates to these commands instead
of appending to journal.md. Journal entries are store-first.
2026-02-28 23:13:17 -05:00
ProofOfConcept
7b811125ca add position field to nodes for stable section ordering
Sections within a file have a natural order that matters —
identity.md reads as a narrative, not an alphabetical index.

The position field (u32) tracks section index within the file.
Set during init and import from parse order. Export and
load-context sort by position instead of key, preserving the
author's intended structure.
2026-02-28 23:06:27 -05:00
ProofOfConcept
57cf61de44 add write, import, and export commands
write KEY: upsert a single node from stdin. Creates new or updates
existing with version bump. No-op if content unchanged.
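
The create, update, and no-op cases can be sketched like this. The `Node`, `Store`, and `Upsert` types here are simplified stand-ins rather than the project's definitions, and the sketch omits the append to the capnp log.

```rust
use std::collections::HashMap;

/// Outcome of an upsert (hypothetical enum for illustration).
#[derive(Debug, PartialEq)]
enum Upsert {
    Created,
    Updated,
    Unchanged,
}

struct Node {
    content: String,
    version: u32,
}

#[derive(Default)]
struct Store {
    nodes: HashMap<String, Node>,
}

impl Store {
    /// Create the node if missing, bump the version if the content changed,
    /// and do nothing if the content is identical.
    fn upsert(&mut self, key: &str, content: &str) -> Upsert {
        match self.nodes.get_mut(key) {
            Some(node) if node.content == content => Upsert::Unchanged,
            Some(node) => {
                node.content = content.to_string();
                node.version += 1;
                Upsert::Updated
            }
            None => {
                let node = Node { content: content.to_string(), version: 1 };
                self.nodes.insert(key.to_string(), node);
                Upsert::Created
            }
        }
    }
}
```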

import FILE: parse markdown sections, diff against store, upsert
changed/new nodes. Incremental — only touches what changed.

export FILE|--all: regenerate markdown from store nodes. Gathers
file-level + section nodes, reconstitutes mem markers with links
and causes from the relation graph.

Together these close the bidirectional sync loop:
  markdown → import → store → export → markdown

Also exposes memory_dir_pub() for use from main.rs.
2026-02-28 23:00:52 -05:00
ProofOfConcept
14b6457231 add load-context and render commands
load-context replaces the shell hook's file-by-file cat approach.
Queries the capnp store directly for all session-start context:
orientation, identity, reflections, interests, inner life, people,
active context, shared reference, technical, and recent journal.

Sections are gathered per-file and output in priority order.
Journal entries filtered to last 7 days by key-embedded date,
capped at 20 most recent.
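
A sketch of the journal filter, assuming keys embed the date as `#j-YYYY-MM-DD...` and that the caller supplies the cutoff date as an ISO string (ISO dates compare correctly as plain strings). `key_date` and `recent_journal` are hypothetical names.

```rust
/// Extract the "YYYY-MM-DD" date embedded after "#j-" in a journal key,
/// or None if the key does not carry a well-formed date.
fn key_date(key: &str) -> Option<&str> {
    let rest = key.split_once("#j-")?.1;
    let date = rest.get(..10)?;
    let well_formed = date.bytes().enumerate().all(|(i, c)| match i {
        4 | 7 => c == b'-',
        _ => c.is_ascii_digit(),
    });
    well_formed.then_some(date)
}

/// Keep keys dated on or after `cutoff`, newest first, at most `cap` entries.
fn recent_journal<'a>(keys: &[&'a str], cutoff: &str, cap: usize) -> Vec<&'a str> {
    let mut recent: Vec<&'a str> = keys
        .iter()
        .copied()
        .filter(|k| key_date(k).is_some_and(|d| d >= cutoff))
        .collect();
    // Sort by embedded date, then by full key to order entries within a day.
    recent.sort_by(|a, b| key_date(b).cmp(&key_date(a)).then_with(|| b.cmp(a)));
    recent.truncate(cap);
    recent
}
```

For the behavior described above, the cutoff would be the date seven days before today and `cap` would be 20.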

render outputs a single node's content to stdout.

The load-memory.sh hook now delegates entirely to
`poc-memory load-context` — capnp store is the single source
of truth for session startup context.
2026-02-28 22:53:39 -05:00
ProofOfConcept
1a01cbf8f8 init: reconcile with existing nodes, filter orphaned edges
init now detects content changes in markdown files and updates
existing nodes (bumps version, appends to capnp log) instead of
only creating new ones. Link resolution uses the redirect table
so references to moved sections (e.g. from the reflections split)
create edges to the correct target.

On cache rebuild from capnp logs, filter out relations that
reference deleted/missing nodes so the relation count matches
the actual graph edge count.
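
The orphan filter amounts to keeping only relations whose endpoints both still exist. A minimal sketch, with `Relation` and `filter_orphans` as hypothetical names:

```rust
use std::collections::HashSet;

/// A directed relation between two node keys (simplified shape).
struct Relation {
    source: String,
    target: String,
}

/// Drop relations that reference a deleted or missing node, so the relation
/// count matches the number of edges the graph can actually build.
fn filter_orphans(relations: Vec<Relation>, live: &HashSet<String>) -> Vec<Relation> {
    relations
        .into_iter()
        .filter(|r| live.contains(&r.source) && live.contains(&r.target))
        .collect()
}
```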
2026-02-28 22:45:31 -05:00
ProofOfConcept
2d6c8d5199 add node-delete command and redirect table for split files
node-delete: soft-deletes a node by appending a deleted version to
the capnp log, then removing it from the in-memory cache.

resolve_redirect: when resolve_key can't find a node, checks a static
redirect table for sections that moved during file splits (like the
reflections.md → reflections-{reading,dreams,zoom}.md split). This
handles immutable files (journal.md with chattr +a) that can't have
their references updated.
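
A sketch of the fallback lookup, assuming a table of exact old-key to new-key pairs. The table entries and section names below are invented for illustration, and this `resolve_key` signature is not the project's.

```rust
use std::collections::HashSet;

/// Static redirect table: old key -> new key (entries are hypothetical).
const REDIRECTS: &[(&str, &str)] = &[
    ("reflections.md#flying", "reflections-dreams.md#flying"),
    ("reflections.md#borges", "reflections-reading.md#borges"),
];

/// Resolve a key directly if it exists; otherwise consult the redirect table
/// and accept the redirect only if its target exists in the store.
fn resolve_key(nodes: &HashSet<String>, key: &str) -> Option<String> {
    if nodes.contains(key) {
        return Some(key.to_string());
    }
    REDIRECTS
        .iter()
        .find(|(old, _)| *old == key)
        .map(|(_, new)| *new)
        .filter(|new| nodes.contains(*new))
        .map(str::to_string)
}
```

Old references in files that cannot be rewritten keep resolving, because the lookup happens at read time rather than by editing the source.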
2026-02-28 22:40:17 -05:00
ProofOfConcept
4b0bba7c56 replace state.json cache with bincode state.bin
Faster serialization/deserialization, smaller on disk (4.2MB vs 5.9MB).
Automatic migration from state.json on first load — reads the JSON,
writes state.bin, deletes the old file.

Added list-keys, list-edges, dump-json commands so Python scripts no
longer need to parse the cache directly. Updated bulk-categorize.py
and consolidation-loop.py to use the new CLI commands.
2026-02-28 22:30:03 -05:00
ProofOfConcept
c4d1675128 fix: persist all mutations to capnp log
mark_used, mark_wrong, and decay all modified node state (weight,
uses, wrongs, spaced_repetition_interval) only in memory + state.json.
Like the categorize fix, these changes would be lost on cache rebuild.

Now all three append updated node versions to the capnp log. Decay
appends all nodes in one batch since it touches every node.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-02-28 22:24:53 -05:00
ProofOfConcept
6322b3fd61 fix: persist categorizations to capnp log
categorize() only updated the in-memory HashMap and state.json cache.
When init appended new nodes to nodes.capnp (making it newer than
state.json), the next load() would rebuild from capnp logs and lose
all category assignments.

Fix: append an updated node version to the capnp log when category
changes, so it survives cache rebuilds.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-02-28 22:19:17 -05:00
ProofOfConcept
23fac4e5fe poc-memory v0.4.0: graph-structured memory with consolidation pipeline
Rust core:
- Cap'n Proto append-only storage (nodes + relations)
- Graph algorithms: clustering coefficient, community detection,
  schema fit, small-world metrics, interference detection
- BM25 text similarity with Porter stemming
- Spaced repetition replay queue
- Commands: search, init, health, status, graph, categorize,
  link-add, link-impact, decay, consolidate-session, etc.

Python scripts:
- Episodic digest pipeline: daily/weekly/monthly-digest.py
- retroactive-digest.py for backfilling
- consolidation-agents.py: 3 parallel Sonnet agents
- apply-consolidation.py: structured action extraction + apply
- digest-link-parser.py: extract ~400 explicit links from digests
- content-promotion-agent.py: promote episodic obs to semantic files
- bulk-categorize.py: categorize all nodes via single Sonnet call
- consolidation-loop.py: multi-round automated consolidation

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-02-28 22:17:00 -05:00