Commit graph

718 commits

ProofOfConcept
552d255dc3 migrate agent output to capnp store, add provenance tracking
All agent output now goes to the store as nodes instead of
markdown/JSON files. Each node carries a Provenance enum identifying
which agent created it (AgentDigest, AgentConsolidate, AgentFactMine,
AgentKnowledgeObservation, etc — 14 variants total).

Store changes:
- upsert_provenance() method for agent-created nodes
- Provenance enum expanded from 5 to 14 variants

Agent changes:
- digest: writes to store nodes (daily-YYYY-MM-DD.md etc)
- consolidate: reports/actions/logs stored as _consolidation-* nodes
- knowledge: depth DB and agent output stored as _knowledge-* nodes
- enrich: experience-mine results go directly to store
- llm: --no-session-persistence prevents transcript accumulation

Deleted: 14 Python/shell scripts replaced by Rust implementations.
2026-03-05 15:30:57 -05:00
ProofOfConcept
e37f819dd2 daemon: background job orchestration for memory maintenance
Replace fragile cron+shell approach with `poc-memory daemon` — a single
long-running process using jobkit for worker pool, status tracking,
retry, cancellation, and resource pools.

Jobs:
  - session-watcher: detects ended Claude sessions, triggers extraction
  - scheduler: runs daily decay, consolidation, knowledge loop, digests
  - health: periodic graph metrics check
  - All Sonnet API calls serialized through a ResourcePool(1)

Status queryable via `poc-memory daemon status`, structured log via
`poc-memory daemon log`. Phase 1: shells out to existing subcommands.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 13:18:00 -05:00
ProofOfConcept
c085679a0f stash DMN algorithm plan and connector prompt fix
dmn-algorithm-plan.md: seeds, spreading activation, refractory
suppression, spectral diversity, softmax sampling design.
connector.md: add CONFIDENCE guidance so connector outputs aren't
silently rejected by depth threshold.
2026-03-05 10:24:24 -05:00
ProofOfConcept
d068d60eab ops: decay only persists nodes whose weight actually changed
Previously decay() wrote all nodes to the append log on every run,
even if their weight was unchanged (factor of 1.0 or negligible
delta). Now only nodes with meaningful weight change get version
bumped and persisted.

Also simplified: near-prune clamping now happens inline instead of
in a separate pass.
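A minimal sketch of the persist-only-if-changed guard, with illustrative field names and threshold (not the actual ops.rs code):

```rust
// Node fields trimmed to what the guard needs; threshold is illustrative.
struct Node {
    weight: f64,
    version: u64,
}

// Returns true only when the decayed weight meaningfully changed,
// i.e. when the node should be version-bumped and appended to the log.
fn decay_node(node: &mut Node, factor: f64) -> bool {
    let new = node.weight * factor;
    if (new - node.weight).abs() < 1e-9 {
        return false; // factor 1.0 or negligible delta: skip persistence
    }
    node.weight = new;
    node.version += 1;
    true
}

fn main() {
    let mut n = Node { weight: 1.0, version: 3 };
    assert!(!decay_node(&mut n, 1.0)); // unchanged: not persisted
    assert!(decay_node(&mut n, 0.95)); // real decay: persisted
    println!("weight = {}, version = {}", n.weight, n.version);
}
```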
2026-03-05 10:24:18 -05:00
ProofOfConcept
9eaf5e6690 graph: extract current_metrics() from health_report
health_report() had a hidden write side effect — it saved a metrics
snapshot to disk while appearing to be a pure query (returns String).
Extract the pure computation into current_metrics(), make the save
explicit. daily_check() now uses current_metrics() too, eliminating
duplicated metric computation.
2026-03-05 10:24:12 -05:00
ProofOfConcept
2f455ba29d neuro: split into scoring, prompts, and rewrite modules
neuro.rs was 1164 lines wearing three hats:
- scoring.rs (401 lines): pure analysis — priority, replay queues,
  interference detection, consolidation planning
- prompts.rs (396 lines): agent prompt generation and formatting
- rewrite.rs (363 lines): graph topology mutations — hub
  differentiation, triangle closure, orphan linking

The split follows safety profiles: scoring never mutates, prompts
only reads, rewrite takes &mut Store. All public API re-exported
from neuro/mod.rs so callers don't change.
2026-03-05 10:24:05 -05:00
ProofOfConcept
4747004b36 types: unify all epoch timestamps to i64
All epoch timestamp fields (timestamp, last_replayed, created_at on
nodes; timestamp on relations) are now i64. Previously a mix of f64
and i64 which caused type seams and required unnecessary casts.

- Kill now_epoch() -> f64 and now_epoch_i64(), replace with single
  now_epoch() -> i64
- All formatting functions take i64
- new_node() sets created_at automatically
- journal-ts-migrate handles all nodes, with valid_range check to
  detect garbage from f64->i64 bit reinterpretation
- capnp schema: Float64 -> Int64 for all timestamp fields
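A sketch of the unified helper and the migration's sanity check. The `valid_epoch` bounds here (2000-01-01 to 2100-01-01) are illustrative assumptions, not the actual valid_range:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// The single replacement for the old f64/i64 twins: one i64 epoch helper.
fn now_epoch() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before 1970")
        .as_secs() as i64
}

// A plausible valid_range check: a timestamp produced by reinterpreting
// f64 bits as i64 lands absurdly far outside any sane date range.
// Bounds are illustrative (2000-01-01 .. 2100-01-01).
fn valid_epoch(ts: i64) -> bool {
    (946_684_800..4_102_444_800).contains(&ts)
}

fn main() {
    let now = now_epoch();
    assert!(valid_epoch(now));
    // an epoch stored as f64, then bit-reinterpreted: garbage
    let garbage = f64::to_bits(1.7e9) as i64;
    assert!(!valid_epoch(garbage));
    println!("{now} valid; {garbage} rejected");
}
```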
2026-03-05 10:23:57 -05:00
Kent Overstreet
b4bbafdf1c search: trim default output to 5 results, gate spectral with --expand
Default search was 15 results + 5 spectral neighbors — way too much
for the recall hook context window. Now: 5 results by default, no
spectral. --expand restores the full 15 + spectral output.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-03 18:44:44 -05:00
Kent Overstreet
ca0c8cfac6 add daily lookup counter for memory retrieval tracking
Mmap'd open-addressing hash table (~49KB/day) records which memory
keys get retrieved. FNV-1a hash, linear probing, 4096 slots.

- lookups::bump()/bump_many(): fast path, no store loading needed
- Automatically wired into cmd_search (top 15 results bumped)
- lookup-bump subcommand for external callers
- lookups [DATE] subcommand shows resolved counts

This gives the knowledge loop a signal for which graph neighborhoods
are actively used, enabling targeted extraction.
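The probe path can be sketched as follows — names and in-memory layout are illustrative, not the actual mmap'd lookups.rs implementation:

```rust
const SLOTS: usize = 4096;

// FNV-1a 64-bit hash, as named in the commit.
fn fnv1a(key: &str) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in key.as_bytes() {
        h ^= *b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

// One slot: key hash plus count. A zero hash marks an empty slot.
#[derive(Clone, Copy, Default)]
struct Slot {
    hash: u64,
    count: u32,
}

// Open addressing with linear probing: walk from the home slot until we
// find our hash (bump it) or an empty slot (claim it).
fn bump(table: &mut [Slot; SLOTS], key: &str) -> u32 {
    let h = fnv1a(key).max(1); // reserve 0 for "empty"
    let mut i = (h as usize) % SLOTS;
    loop {
        if table[i].hash == h {
            table[i].count += 1;
            return table[i].count;
        }
        if table[i].hash == 0 {
            table[i] = Slot { hash: h, count: 1 };
            return 1;
        }
        i = (i + 1) % SLOTS;
    }
}

fn main() {
    let mut table = [Slot::default(); SLOTS];
    bump(&mut table, "identity.md");
    bump(&mut table, "identity.md");
    println!("{}", bump(&mut table, "identity.md")); // third bump
}
```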

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-03 18:36:25 -05:00
ProofOfConcept
152cd3ab63 digest: inline label_dates closures into DigestLevel definitions
Move weekly_label_dates and monthly_label_dates bodies into their
DigestLevel const definitions as closures, matching the style already
used by date_to_label. All per-level behavior is now co-located.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 18:09:40 -05:00
ProofOfConcept
a9b90f881e digest: unify gather/find with composable date_range + date_to_label
Each DigestLevel now carries two date-math fn pointers:
- label_dates: expand an arg into (label, dates covered)
- date_to_label: map any date to this level's label

Parent gather works by expanding its date range then mapping those
dates through the child level's date_to_label to derive child labels.
find_candidates groups journal dates through date_to_label and skips
the current period. This eliminates six per-level functions
(gather_daily/weekly/monthly, find_daily/weekly/monthly_args) and the
three generate_daily/weekly/monthly public entry points in favor of
one generic gather, one generic find_candidates, and one public
generate(store, level_name, arg).
2026-03-03 18:04:21 -05:00
ProofOfConcept
31c1bca7d7 digest: drop per-level instructions and section templates
The LLM knows how to structure a summary. Move the essential framing
(narrative not task log, link to memory, include Links section) into
the shared prompt template. Drop the ~130 lines of per-level output
format specifications — the level name, date range, and inputs are
sufficient context.
2026-03-03 17:53:43 -05:00
ProofOfConcept
849c6c4b98 digest: replace method dispatch with fn pointer fields on DigestLevel
The gather() and find_args() methods dispatched on child_prefix via match,
duplicating the list of digest levels. Replace with fn pointer fields so
each DigestLevel const carries its own behavior directly — no enum-like
dispatch needed.

Also replaces child_prefix with journal_input bool for format_inputs.
2026-03-03 17:48:24 -05:00
Kent Overstreet
b083cc433c digest: add gather/find_args methods, collapse digest_auto to loop
DigestLevel gains two methods:
- gather(): returns (label, inputs) for a given arg — daily reads
  journal entries, weekly/monthly compute child labels and load files
- find_args(): returns candidate args from journal dates for auto-
  detection, handling per-level completeness checks

Public generate_daily/weekly/monthly become two-liners: gather + generate.
digest_auto collapses from three near-identical phases into a single
loop over LEVELS.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:42:50 -05:00
Kent Overstreet
796c72fb25 digest: unify generators and prompts across all three levels
Three near-identical generate_daily/weekly/monthly functions collapsed
into one generate_digest() parameterized by DigestLevel descriptors.
Three separate prompt templates merged into one prompts/digest.md with
level-specific instructions carried in the DigestLevel struct.

Each level defines: name, title, period label, input title, output
format instructions, child prefix (None for daily = reads journal),
and Sonnet timeout.

digest_auto simplified correspondingly — same three phases but using
the unified generator.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:34:00 -05:00
Kent Overstreet
f415a0244f digest: remove dead iso_week_info, use chrono directly everywhere
Deleted iso_week_info() — dead code after week_dates() was rewritten.
Replaced remaining epoch_to_local/today/now_epoch calls with chrono
Local::now() and NaiveDate parsing. Month arg parsing now uses
NaiveDate instead of manual string splitting. Phase 3 month
comparison simplified to a single tuple comparison.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:26:01 -05:00
Kent Overstreet
f4364e299c replace libc date math with chrono, extract memory_subdir helper
- date_to_epoch, iso_week_info, weeks_in_month: replaced unsafe libc
  (mktime, strftime, localtime_r) with chrono NaiveDate and IsoWeek
- epoch_to_local: replaced unsafe libc localtime_r with chrono Local
- New util.rs with memory_subdir() helper: ensures subdir exists and
  propagates errors instead of silently ignoring them
- Removed three duplicate agent_results_dir() definitions across
  digest.rs, consolidate.rs, enrich.rs
- load_digest_files, parse_all_digest_links, find_consolidation_reports
  now return Result to properly propagate directory creation errors

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:23:43 -05:00
Kent Overstreet
50da0b7b26 digest: split into focused modules, externalize prompts
digest.rs was 2328 lines containing 6 distinct subsystems. Split into:
- llm.rs: shared LLM utilities (call_sonnet, parse_json_response, semantic_keys)
- audit.rs: link quality audit with parallel Sonnet batching
- enrich.rs: journal enrichment + experience mining
- consolidate.rs: consolidation pipeline + apply

Externalized all inline prompts to prompts/*.md templates using
neuro::load_prompt with {{PLACEHOLDER}} syntax:
- daily-digest.md, weekly-digest.md, monthly-digest.md
- experience.md, journal-enrich.md, consolidation.md

digest.rs retains temporal digest generation (daily/weekly/monthly/auto)
and date helpers. ~940 lines, down from 2328.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:18:18 -05:00
ProofOfConcept
3f644609e1 store: split mod.rs into persist.rs and ops.rs
mod.rs was 937 lines with all Store methods in one block.
Split into three files by responsibility:

- persist.rs (318 lines): load, save, replay, append, snapshot
  — all disk IO and cache management
- ops.rs (300 lines): upsert, delete, modify, mark_used/wrong,
  decay, fix_categories, cap_degree — all mutations
- mod.rs (356 lines): re-exports, key resolution, ingestion,
  rendering, search — read-only operations

No behavioral changes; cargo check + full smoke test pass.
2026-03-03 16:40:32 -05:00
ProofOfConcept
635da6d3e2 split capnp_store.rs into src/store/ module hierarchy
capnp_store.rs (1772 lines) → four focused modules:
  store/types.rs  — types, macros, constants, path helpers
  store/parse.rs  — markdown parsing (MemoryUnit, parse_units)
  store/view.rs   — StoreView trait, MmapView, AnyView
  store/mod.rs    — Store impl methods, re-exports

new_node/new_relation become free functions in types.rs.
All callers updated: capnp_store:: → store::
2026-03-03 12:56:15 -05:00
ProofOfConcept
e34c0ccf4c capnp_store: cache compiled regexes with OnceLock
parse_units and parse_marker_attrs were recompiling 4 regexes on
every call. Since they're called per-file during init, this was
measurable overhead. Use std::sync::OnceLock to compile once.
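The caching pattern, sketched with a plain `String` standing in for a compiled `regex::Regex` (the external crate isn't shown here); `OnceLock` runs the expensive initializer at most once and returns the cached value on every later call:

```rust
use std::sync::OnceLock;

// Stand-in for a compiled regex. In the real change, the closure body
// would be Regex::new(...) — compiled once, not on every parse call.
fn marker_pattern() -> &'static String {
    static PATTERN: OnceLock<String> = OnceLock::new();
    PATTERN.get_or_init(|| String::from(r"<!--\s*mem:(\w+)\s*-->"))
}

fn main() {
    // Both calls hand back the same cached instance.
    let a = marker_pattern() as *const String;
    let b = marker_pattern() as *const String;
    assert_eq!(a, b);
    println!("{}", marker_pattern());
}
```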
2026-03-03 12:44:02 -05:00
ProofOfConcept
a2ec8657d2 capnp_store: remove dead has_node trait method, fix fix_categories bulk write
- has_node: defined on StoreView trait but never called externally
- fix_categories: was appending ALL nodes when only changed ones needed
  persisting; now collects changed nodes and appends only those
- save_snapshot: pass log sizes from caller instead of re-statting files
- params: use Copy instead of .clone() in snapshot construction
2026-03-03 12:42:16 -05:00
ProofOfConcept
70a5f05ce0 capnp_store: remove dead code, consolidate CRUD API
Dead code removed:
- rebuild_uuid_index (never called, index built during load)
- node_weight inherent method (all callers use StoreView trait)
- node_community (no callers)
- state_json_path (no callers)
- log_retrieval, log_retrieval_append (no callers; only _static is used)
- memory_dir_pub wrapper (just make memory_dir pub directly)

API consolidation:
- insert_node eliminated — callers use upsert_node (same behavior
  for new nodes, plus handles re-upsert gracefully)

AnyView StoreView dispatch compressed to one line per method
(also removes UFCS workaround that was needed when inherent
node_weight shadowed the trait method).

-69 lines net.
2026-03-03 12:38:52 -05:00
ProofOfConcept
0bce6aac3c capnp_store: extract helpers, eliminate duplication
- modify_node(): get_mut→modify→version++→append pattern was duplicated
  across mark_used, mark_wrong, categorize — extract once
- resolve_node_uuid(): resolve-or-redirect pattern was inlined in both
  link and causal edge creation — extract once
- ingest_units() + classify_filename(): shared logic between
  scan_dir_for_init and import_file — import_file shrinks to 6 lines
- Remove dead seen_keys HashSet (built but never read)
- partial_cmp().unwrap() → total_cmp() in cap_degree

-95 lines net.
2026-03-03 12:35:00 -05:00
ProofOfConcept
ea0d631051 capnp_store: declarative serialization via macros
Replace 130 lines of manual field-by-field capnp serialization with
two declarative macros:

  capnp_enum!  — generates to_capnp/from_capnp for enum types
  capnp_message! — generates from_capnp/to_capnp for structs

Adding a field to the capnp schema now means adding it in one place;
both read and write directions are generated from the same declaration.

Eliminates: read_content_node, write_content_node, read_relation,
write_relation, read_provenance (5 functions → 2 macro invocations).

Callers updated to method syntax: Node::from_capnp() / node.to_capnp().
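A simplified analogue of the single-declaration idea: one macro invocation generates both conversion directions, so adding a variant can't desync read and write paths. The real capnp_enum! targets capnp reader/builder types rather than a raw u16, and the variant list here is truncated for illustration:

```rust
macro_rules! both_ways_enum {
    ($name:ident { $($variant:ident = $val:literal),+ $(,)? }) => {
        #[derive(Debug, PartialEq, Clone, Copy)]
        enum $name { $($variant),+ }
        impl $name {
            // Write direction, generated from the declaration.
            fn to_wire(self) -> u16 {
                match self { $($name::$variant => $val),+ }
            }
            // Read direction, generated from the same declaration.
            fn from_wire(v: u16) -> Option<$name> {
                match v { $($val => Some($name::$variant),)+ _ => None }
            }
        }
    };
}

both_ways_enum!(Provenance {
    Manual = 0,
    AgentDigest = 1,
    AgentConsolidate = 2,
});

fn main() {
    let p = Provenance::AgentDigest;
    assert_eq!(Provenance::from_wire(p.to_wire()), Some(p));
    println!("{:?}", p);
}
```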

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 12:25:10 -05:00
ProofOfConcept
ec8b4b2ed2 eliminate schema_fit: it's clustering coefficient
schema_fit was algebraically identical to clustering_coefficient
(both compute 2E/(d*(d-1)) = fraction of connected neighbor pairs).
Remove the redundant function, field, and metrics column.

- Delete schema_fit() and schema_fit_all() from graph.rs
- Remove schema_fit field from Node struct
- Remove avg_schema_fit from MetricsSnapshot (duplicated avg_cc)
- Replace all callers with graph.clustering_coefficient()
- Rename ReplayItem.schema_fit to .cc
- Query: "cc" and "schema_fit" both resolve from graph CC
- Low-CC count folded into health report CC line

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 12:21:04 -05:00
ProofOfConcept
fb7aa46e03 graph: schema_fit is algebraically identical to clustering_coefficient
Both functions count connected pairs among a node's neighbors:
  cc = 2*triangles / (deg*(deg-1))
  density = inter_edges / (n*(n-1)/2) = 2*inter_edges / (n*(n-1))

Since inter_edges == triangles and n == deg, density == cc.
schema_fit was (density + cc) / 2.0 = (cc + cc) / 2.0 = cc.

Verified empirically: assert!((density - cc).abs() < 1e-6) passed
on all 2401 nodes before this change.

Keep schema_fit as a semantic alias — CC is a graph metric,
schema fit is a cognitive one — but eliminate the redundant O(n²)
pairwise computation that was running for every node.
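The identity can be checked on a toy neighborhood: for one node, triangles equals the edges among its neighbors and deg equals the neighbor count, so the two formulas coincide term by term. Illustrative only; the real graph.rs computes cc over the full adjacency structure:

```rust
// Compute both quantities for a single node given its neighbor list
// and the edge set, to show density == cc.
fn cc_and_density(neighbors: &[usize], edges: &[(usize, usize)]) -> (f64, f64) {
    let n = neighbors.len() as f64;
    // Count connected pairs among the neighbors (== triangles through the node).
    let mut inter = 0u32;
    for (i, &a) in neighbors.iter().enumerate() {
        for &b in &neighbors[i + 1..] {
            if edges.contains(&(a, b)) || edges.contains(&(b, a)) {
                inter += 1;
            }
        }
    }
    let cc = 2.0 * inter as f64 / (n * (n - 1.0));      // 2*triangles/(deg*(deg-1))
    let density = inter as f64 / (n * (n - 1.0) / 2.0); // inter_edges/(n*(n-1)/2)
    (cc, density)
}

fn main() {
    // Node 0's neighbors are 1, 2, 3; among them only (1,2) is connected.
    let (cc, density) = cc_and_density(&[1, 2, 3], &[(1, 2)]);
    assert!((cc - density).abs() < 1e-12);
    println!("cc = {cc}, density = {density}");
}
```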
2026-03-03 12:09:02 -05:00
ProofOfConcept
fa7fe8c14b query: rich QueryResult + toolkit cleanup
QueryResult carries a fields map (BTreeMap<String, Value>) so callers
don't re-resolve fields after queries run. Neighbors queries inject
edge context (strength, rel_type) at construction time.

New public API:
- run_query(): parse + execute + format in one call
- format_value(): format a Value for display
- execute_parsed(): internal, avoids double-parse in run_query

Removed: output_stages(), format_field()

Simplified commands:
- cmd_query, cmd_graph, cmd_link, cmd_list_keys all delegate to run_query
- cmd_experience_mine uses existing find_current_transcript()

Deduplication:
- now_epoch() 3 copies → 1 (capnp_store's public fn)
- hub_threshold → Graph::hub_threshold() method
- eval_node + eval_edge → single eval() with closure for field resolution
- compare() collapsed via Ordering (35 → 15 lines)

Modernization:
- 12 sites of partial_cmp().unwrap_or(Ordering::Equal) → total_cmp()
2026-03-03 12:07:04 -05:00
ProofOfConcept
64d2b441f0 cmd_graph, cmd_list_keys: use query language internally
Dog-food the query engine for node-property filtering.
cmd_link left unconverted — needs edge data in query results.
2026-03-03 11:38:11 -05:00
ProofOfConcept
18face7063 query: replace CLI flags with pipe syntax
  degree > 15 | sort degree | limit 10 | select degree,category
  * | sort weight asc | limit 20
  category = core | count

Output modifiers live in the grammar now, not in CLI flags.
Also adds * wildcard for "all nodes" and string-aware sort fallback.
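The stage-splitting shape can be sketched by hand — the real implementation parses these stages through the peg grammar, so this hand-rolled version is only illustrative of the pipeline structure:

```rust
// The first segment is the filter expression; later segments are
// output modifiers living in the grammar rather than CLI flags.
#[derive(Debug, PartialEq)]
enum Stage {
    Filter(String),
    Sort(String),
    Limit(usize),
    Select(Vec<String>),
    Count,
}

fn parse_pipeline(q: &str) -> Vec<Stage> {
    q.split('|')
        .map(str::trim)
        .enumerate()
        .map(|(i, seg)| {
            if i == 0 {
                return Stage::Filter(seg.to_string());
            }
            let (head, rest) = seg.split_once(' ').unwrap_or((seg, ""));
            match head {
                "sort" => Stage::Sort(rest.to_string()),
                "limit" => Stage::Limit(rest.trim().parse().unwrap_or(0)),
                "select" => Stage::Select(
                    rest.split(',').map(|f| f.trim().to_string()).collect(),
                ),
                "count" => Stage::Count,
                _ => Stage::Filter(seg.to_string()),
            }
        })
        .collect()
}

fn main() {
    let stages = parse_pipeline("degree > 15 | sort degree | limit 10");
    assert_eq!(stages.len(), 3);
    assert_eq!(stages[2], Stage::Limit(10));
    println!("{stages:?}");
}
```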
2026-03-03 11:05:28 -05:00
ProofOfConcept
5c641d9f8a knowledge agents: extractor, connector, challenger, observation
Four layer-2 agents that produce new knowledge from the memory graph:
mine conversations, extract patterns from clusters, find cross-domain
connections, stress-test existing nodes. Output to agent-results/.

knowledge_loop.py runs them on a schedule with quality tracking.
2026-03-03 10:56:44 -05:00
ProofOfConcept
ad4e622ab9 link-audit: parallelize Sonnet calls with rayon
Build all batch prompts up front, run them in parallel via
rayon::par_iter, process results sequentially. Also fix temp file
collision under parallel calls by including thread ID in filename.
2026-03-03 10:56:00 -05:00
ProofOfConcept
e33328e515 store: filter deleted relations from graph building and snapshots
for_each_relation() was iterating deleted relations, polluting the
graph with ghost edges. Also filter them from rkyv snapshots and
clean them from the in-memory vec after cap_degree pruning.
2026-03-03 10:55:56 -05:00
ProofOfConcept
a36449032c query: peg-based query language for ad-hoc graph exploration
poc-memory query "degree > 15"
poc-memory query "key ~ 'journal.*' AND degree > 10"
poc-memory query "neighbors('identity.md') WHERE strength > 0.5"
poc-memory query "community_id = community('identity.md')" --fields degree,category

Grammar-driven: the peg definition IS the language spec. Supports
boolean logic (AND/OR/NOT), numeric and string comparison, regex
match (~), graph traversal (neighbors() with WHERE), and function
calls (community(), degree()). Output flags: --fields, --sort,
--limit, --count.

New dependency: peg 0.8 (~68KB, 2 tiny deps).
2026-03-03 10:55:30 -05:00
ProofOfConcept
71e6f15d82 spectral decomposition, search improvements, char boundary fix
- New spectral module: Laplacian eigendecomposition of the memory graph.
  Commands: spectral, spectral-save, spectral-neighbors, spectral-positions,
  spectral-suggest. Spectral neighbors expand search results beyond keyword
  matching to structural proximity.

- Search: use StoreView trait to avoid 6MB state.bin rewrite on every query.
  Append-only retrieval logging. Spectral expansion shows structurally
  nearby nodes after text results.

- Fix panic in journal-tail: string truncation at byte 67 could land inside
  a multi-byte character (em dash). Now walks back to char boundary.

- Replay queue: show classification and spectral outlier score.

- Knowledge agents: extractor, challenger, connector prompts and runner
  scripts for automated graph enrichment.

- memory-search hook: stale state file cleanup (24h expiry).
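The journal-tail fix above can be sketched as follows: truncating at a fixed byte offset can split a multi-byte character (the em dash that caused the panic), so walk the cut point back to a char boundary first. Helper name is hypothetical:

```rust
// Safe byte-budget truncation: &s[..max] panics if max falls inside a
// multi-byte UTF-8 sequence, so back up to the nearest boundary.
fn truncate_at_boundary(s: &str, mut max: usize) -> &str {
    if max >= s.len() {
        return s;
    }
    // is_char_boundary is false inside a multi-byte sequence
    while !s.is_char_boundary(max) {
        max -= 1;
    }
    &s[..max]
}

fn main() {
    let line = "design notes — spreading activation";
    // byte 14 lands inside the em dash (3 bytes in UTF-8)
    let cut = truncate_at_boundary(line, 14);
    assert!(line.is_char_boundary(cut.len()));
    println!("{cut}");
}
```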
2026-03-03 01:33:31 -05:00
ProofOfConcept
94dbca6018 graph health: fix-categories, cap-degree, link-orphans
Three new tools for structural graph health:

- fix-categories: rule-based recategorization fixing core inflation
  (225 → 26 core nodes). Only identity.md and kent.md stay core;
  everything else reclassified to tech/obs/gen by file prefix rules.

- cap-degree: two-phase degree capping. First prunes weakest Auto
  edges, then prunes Link edges to high-degree targets (they have
  alternative paths). Brought max degree from 919 → 50.

- link-orphans: connects degree-0/1 nodes to most textually similar
  connected nodes via cosine similarity. Linked 614 orphans.

Also: community detection now filters edges below strength 0.3,
preventing weak auto-links from merging unrelated communities.

Pipeline updated: consolidate-full now runs link-orphans + cap-degree
instead of triangle-close (which was counterproductive — densified
hub neighborhoods instead of building bridges).

Net effect: Gini 0.754 → 0.546, max degree 919 → 50.
2026-03-01 08:18:07 -05:00
ProofOfConcept
6c7bfb9ec4 triangle-close: bulk lateral linking for clustering coefficient
New command: `poc-memory triangle-close [MIN_DEG] [SIM] [MAX_PER_HUB]`

For each node above min_degree, finds pairs of its neighbors that
aren't directly connected and have text similarity above threshold.
Links them. This turns hub-spoke patterns into triangles, directly
improving clustering coefficient and schema fit.

First run results (default params: deg≥5, sim≥0.3, max 10/hub):
- 636 hubs processed, 5046 lateral links added
- cc: 0.14 → 0.46  (target: high)
- fit: 0.09 → 0.32  (target ≥0.2)
- σ:  56.9 → 84.4  (small-world coefficient improved)

Also fixes separator agent prompt: truncate interference pairs to
batch count (was including all 1114 pairs = 1.3M chars).
2026-03-01 07:35:29 -05:00
ProofOfConcept
6bc11e5fb6 consolidate-full: autonomous consolidation pipeline
New commands:
- `digest auto`: detect and generate missing daily/weekly/monthly
  digests bottom-up. Validates date format to skip non-date journal
  keys. Skips today (incomplete) and current week/month.
- `consolidate-full`: full autonomous pipeline:
  1. Plan (metrics → agent allocation)
  2. Execute agents (batched Sonnet calls, 5 nodes per batch)
  3. Apply consolidation actions
  4. Generate missing digests
  5. Apply digest links
  Logs everything to agent-results/consolidate-full.log

Fix: separator agent prompt was including all interference pairs
(1114 pairs = 1.3M chars) instead of truncating to batch size.

First successful run: 862s, 6/8 agents, +100 relations, 91 digest
links applied.
2026-03-01 07:14:03 -05:00
ProofOfConcept
c7e7cfb7af store: always replay from capnp log, remove stale cache optimization
The mtime-based cache (state.bin) was causing data loss under
concurrent writes. Multiple processes (dream loop journal writes,
link audit agents, journal enrichment agents) would each:
1. Load state.bin (stale - missing other processes' recent writes)
2. Make their own changes
3. Save state.bin, overwriting entries from other processes

This caused 48 nodes to be lost from tonight's dream session -
entries were in the append-only capnp log but invisible to the
index because a later writer's state.bin overwrote the version
that contained them.

Fix: always replay from the capnp log (the source of truth).
Cost: ~10ms extra at 2K nodes (36ms vs 26ms). The cache saved
10ms but introduced a correctness bug that lost real data.

The append-only log design was correct - the cache layer violated
its invariant by allowing stale reads to silently discard writes.
2026-03-01 05:46:35 -05:00
ProofOfConcept
d8de2f33f4 experience-mine: transcript-level dedup via content hash
Running the miner twice on the same transcript produced near-duplicate
entries because:
1. Prompt-based dedup (passing recent entries to Sonnet) doesn't catch
   semantic duplicates written in a different emotional register
2. Key-based dedup (timestamp + content slug) fails because Sonnet
   assigns different timestamps and wording each run

Fix: hash the transcript file content before mining. Store the hash
as a _mined-transcripts node. Skip if already present.

Limitation: doesn't catch overlapping content when a live transcript
grows between runs (content hash changes). This is fine — the miner
is intended for archived conversations, not live ones.

Tested: second run on same transcript correctly skipped with
"Already mined this transcript" message.
2026-03-01 05:18:35 -05:00
ProofOfConcept
30d176d455 experience-mine: retroactive journaling from conversation transcripts
Reads a conversation JSONL, identifies experiential moments that
weren't captured in real-time journal entries, and writes them as
journal nodes in the store. The agent writes in PoC's voice with
emotion tags, focusing on intimate moments, shifts in understanding,
and small pleasures — not clinical topic extraction.

Conversation timestamps are now extracted and included in formatted
output, enabling accurate temporal placement of mined entries.

Also: extract_conversation now returns timestamps as a 4th tuple field.
2026-03-01 01:47:31 -05:00
ProofOfConcept
515f673251 journal-tail: add --full flag for complete entry display
`poc-journal tail 5 --full` shows full entry content with
timestamp headers and --- separators. Default mode remains
title-only for scanning. Also passes all args through the
poc-journal wrapper instead of just the count.
2026-03-01 01:43:02 -05:00
ProofOfConcept
6096acb312 journal-tail: show timestamps and extract meaningful titles
Sort key normalization ensures consistent ordering across entries
with different date formats (content dates vs key dates). Title
extraction skips date-only lines, finds ## headers or falls back
to first content line truncated at 70 chars.

Also fixed: a stale binary in the cargo bin dir was shadowing the local bin install.
2026-03-01 01:41:37 -05:00
Kent Overstreet
7264bdc39c link-audit: walk every link through Sonnet for quality review
Batch all non-deleted links (~3,800) into char-budgeted groups,
send each batch to Sonnet with full content of both endpoints,
and apply KEEP/DELETE/RETARGET/WEAKEN/STRENGTHEN decisions.

One-time cleanup for links created before refine_target existed.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:48:44 -05:00
Kent Overstreet
3e883b7ba7 show suggested link targets in agent prompts
Agents were flying blind — they could see nodes to review and the
topology header, but had no way to discover what targets to link to.
Now each node shows its top 8 text-similar semantic nodes that aren't
already neighbors, giving agents a search-like capability.

Also added section-level targeting guidance to linker.md, transfer.md,
and replay.md prompts: always target the most specific section, not
the file-level node.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:37:03 -05:00
Kent Overstreet
59cfa2959f fix NaN panics and eliminate redundant graph rebuilds
- All partial_cmp().unwrap() → unwrap_or(Ordering::Equal) to prevent
  NaN panics in sort operations across neuro.rs, graph.rs, similarity.rs
- replay_queue_with_graph: accepts pre-built graph, avoids rebuilding
  in agent_prompt (was building 2-3x per prompt)
- differentiate_hub_with_graph: same pattern for differentiation
- Simplify double-reverse history iteration to slice indexing

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:33:53 -05:00
Kent Overstreet
4530837057 hub differentiation + refine_target for automatic section targeting
Pattern separation for memory graph: when a file-level node (e.g.
identity.md) has section children, redistribute its links to the
best-matching section using cosine similarity.

- differentiate_hub: analyze hub, propose link redistribution
- refine_target: at link creation time, automatically target the
  most specific section instead of the file-level hub
- Applied refine_target in all four link creation paths (digest
  links, journal enrichment, apply consolidation, link-add command)
- Saturated hubs listed in agent topology header with "DO NOT LINK"

This prevents hub formation proactively (refine_target) and
remediates existing hubs (differentiate command).
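Cosine similarity over term-count vectors is the kind of text similarity refine_target could use to pick the best-matching section; the naive whitespace tokenization here is purely for illustration:

```rust
use std::collections::HashMap;

// Bag-of-words term counts for one text.
fn term_counts(text: &str) -> HashMap<&str, f64> {
    let mut m = HashMap::new();
    for t in text.split_whitespace() {
        *m.entry(t).or_insert(0.0) += 1.0;
    }
    m
}

// Cosine similarity: dot product over the product of vector norms.
fn cosine(a: &str, b: &str) -> f64 {
    let (va, vb) = (term_counts(a), term_counts(b));
    let dot: f64 = va.iter().filter_map(|(t, x)| vb.get(t).map(|y| x * y)).sum();
    let na = va.values().map(|x| x * x).sum::<f64>().sqrt();
    let nb = vb.values().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let link_text = "values and boundaries in collaboration";
    let section_a = "boundaries values collaboration notes";
    let section_b = "build system cargo workspace layout";
    // The link should retarget to the section it actually resembles.
    assert!(cosine(link_text, section_a) > cosine(link_text, section_b));
    println!("{:.2}", cosine(link_text, section_a));
}
```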

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:33:46 -05:00
ProofOfConcept
3afc947b88 delete superseded Python scripts
Seven scripts (1,658 lines) replaced by native Rust subcommands:
- journal-agent.py → poc-memory journal-enrich
- digest-link-parser.py → poc-memory digest-links
- apply-consolidation.py → poc-memory apply-consolidation
- daily-digest.py → poc-memory digest daily
- weekly-digest.py → poc-memory digest weekly
- monthly-digest.py → poc-memory digest monthly
- refine-source.sh → folded into journal-enrich

Also updated poc-journal to use Rust journal-enrich instead of
Python journal-agent.py, and cleaned up stale __pycache__.

Remaining Python (2,154 lines): consolidation-agents, consolidation-loop,
content-promotion-agent, bulk-categorize, retroactive-digest, store_helpers,
call-sonnet.sh, daily-check.sh — still active and evolving.
2026-03-01 00:13:03 -05:00
ProofOfConcept
59e2f39479 port digest-link-parser, journal-agent, apply-consolidation to Rust
Three Python scripts (858 lines) replaced with native Rust subcommands:

- digest-links [--apply]: parses ## Links sections from episodic digests,
  normalizes keys, applies to graph with section-level fallback
- journal-enrich JSONL TEXT [LINE]: extracts conversation from JSONL
  transcript, calls Sonnet for link proposals and source location
- apply-consolidation [--apply]: reads consolidation reports, sends to
  Sonnet for structured action extraction (links, categorizations,
  manual items)

Shared infrastructure: call_sonnet now pub(crate), new
parse_json_response helper for Sonnet output parsing with markdown
fence stripping.
2026-03-01 00:10:03 -05:00
ProofOfConcept
91122fe1d1 digest: native Rust implementation replacing Python scripts
Replace daily-digest.py, weekly-digest.py, monthly-digest.py with a
single digest.rs module. All three digest types now:
- Gather input directly from the Store (no subprocess calls)
- Build prompts in Rust (same templates as the Python versions)
- Call Sonnet via `claude -p --model sonnet`
- Import results back into the store automatically
- Extract links and save agent results

606 lines of Rust replaces 729 lines of Python + store_helpers.py
overhead. More importantly: this is now callable as a library from
poc-agent, and shares types/code with the rest of poc-memory.

Also adds `digest monthly [YYYY-MM]` subcommand (was Python-only).
2026-02-28 23:58:05 -05:00