Commit graph

18 commits

Author SHA1 Message Date
ProofOfConcept
31c1bca7d7 digest: drop per-level instructions and section templates
The LLM knows how to structure a summary. Move the essential framing
(narrative not task log, link to memory, include Links section) into
the shared prompt template. Drop the ~130 lines of per-level output
format specifications — the level name, date range, and inputs are
sufficient context.
2026-03-03 17:53:43 -05:00
ProofOfConcept
849c6c4b98 digest: replace method dispatch with fn pointer fields on DigestLevel
The gather() and find_args() methods dispatched on child_prefix via match,
duplicating the list of digest levels. Replace with fn pointer fields so
each DigestLevel const carries its own behavior directly — no enum-like
dispatch needed.

Also replaces child_prefix with journal_input bool for format_inputs.
2026-03-03 17:48:24 -05:00
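A minimal sketch of the fn-pointer-field pattern this commit describes. Names and signatures here (`gather_daily`, `gather_for`, the `(String, Vec<String>)` return shape) are assumptions for illustration, not the actual poc-memory code:

```rust
// Hypothetical sketch: each DigestLevel const carries its behavior as a
// fn pointer field, so no enum-like match on child_prefix is needed.
struct DigestLevel {
    name: &'static str,
    journal_input: bool,
    // Behavior travels with the descriptor instead of a central match.
    gather: fn(&str) -> (String, Vec<String>),
}

fn gather_daily(arg: &str) -> (String, Vec<String>) {
    (format!("Daily {arg}"), vec![format!("journal/{arg}")])
}

fn gather_weekly(arg: &str) -> (String, Vec<String>) {
    (format!("Weekly {arg}"), vec![format!("daily digests for {arg}")])
}

const LEVELS: [DigestLevel; 2] = [
    DigestLevel { name: "daily", journal_input: true, gather: gather_daily },
    DigestLevel { name: "weekly", journal_input: false, gather: gather_weekly },
];

fn gather_for(level_name: &str, arg: &str) -> Option<(String, Vec<String>)> {
    // Look up the descriptor, then call through its fn pointer.
    LEVELS
        .iter()
        .find(|l| l.name == level_name)
        .map(|l| (l.gather)(arg))
}
```

Adding a new level then means adding one const entry; no dispatch site needs to change.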
Kent Overstreet
b083cc433c digest: add gather/find_args methods, collapse digest_auto to loop
DigestLevel gains two methods:
- gather(): returns (label, inputs) for a given arg — daily reads
  journal entries, weekly/monthly compute child labels and load files
- find_args(): returns candidate args from journal dates for auto-
  detection, handling per-level completeness checks

Public generate_daily/weekly/monthly become two-liners: gather + generate.
digest_auto collapses from three near-identical phases into a single
loop over LEVELS.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:42:50 -05:00
Kent Overstreet
796c72fb25 digest: unify generators and prompts across all three levels
Three near-identical generate_daily/weekly/monthly functions collapsed
into one generate_digest() parameterized by DigestLevel descriptors.
Three separate prompt templates merged into one prompts/digest.md with
level-specific instructions carried in the DigestLevel struct.

Each level defines: name, title, period label, input title, output
format instructions, child prefix (None for daily = reads journal),
and Sonnet timeout.

digest_auto simplified correspondingly — same three phases but using
the unified generator.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:34:00 -05:00
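The shape of the unification can be sketched as one generator driven by descriptor fields. Field names and the prompt layout below are illustrative assumptions, not the real templates:

```rust
// Hypothetical sketch: one generate_digest() parameterized by a
// DigestLevel descriptor replaces three near-identical functions.
struct DigestLevel {
    name: &'static str,
    title: &'static str,
    period_label: &'static str,
}

const LEVELS: [DigestLevel; 3] = [
    DigestLevel { name: "daily", title: "Daily Digest", period_label: "day" },
    DigestLevel { name: "weekly", title: "Weekly Digest", period_label: "week" },
    DigestLevel { name: "monthly", title: "Monthly Digest", period_label: "month" },
];

// Level-specific wording comes from the descriptor, not from three
// separate function bodies or prompt files.
fn generate_digest(level: &DigestLevel, arg: &str, inputs: &str) -> String {
    format!(
        "# {} for {arg}\nCovering one {}.\n\nInputs:\n{inputs}",
        level.title, level.period_label
    )
}
```

A `digest_auto`-style caller then just loops `for level in &LEVELS { ... }` instead of repeating three phases.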
Kent Overstreet
f415a0244f digest: remove dead iso_week_info, use chrono directly everywhere
Deleted iso_week_info() — dead code after week_dates() was rewritten.
Replaced remaining epoch_to_local/today/now_epoch calls with chrono
Local::now() and NaiveDate parsing. Month arg parsing now uses
NaiveDate instead of manual string splitting. Phase 3 month
comparison simplified to a single tuple comparison.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:26:01 -05:00
Kent Overstreet
f4364e299c replace libc date math with chrono, extract memory_subdir helper
- date_to_epoch, iso_week_info, weeks_in_month: replaced unsafe libc
  (mktime, strftime, localtime_r) with chrono NaiveDate and IsoWeek
- epoch_to_local: replaced unsafe libc localtime_r with chrono Local
- New util.rs with memory_subdir() helper: ensures subdir exists and
  propagates errors instead of silently ignoring them
- Removed three duplicate agent_results_dir() definitions across
  digest.rs, consolidate.rs, enrich.rs
- load_digest_files, parse_all_digest_links, find_consolidation_reports
  now return Result to properly propagate directory creation errors

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:23:43 -05:00
Kent Overstreet
50da0b7b26 digest: split into focused modules, externalize prompts
digest.rs was 2328 lines containing 6 distinct subsystems. Split into:
- llm.rs: shared LLM utilities (call_sonnet, parse_json_response, semantic_keys)
- audit.rs: link quality audit with parallel Sonnet batching
- enrich.rs: journal enrichment + experience mining
- consolidate.rs: consolidation pipeline + apply

Externalized all inline prompts to prompts/*.md templates using
neuro::load_prompt with {{PLACEHOLDER}} syntax:
- daily-digest.md, weekly-digest.md, monthly-digest.md
- experience.md, journal-enrich.md, consolidation.md

digest.rs retains temporal digest generation (daily/weekly/monthly/auto)
and date helpers. ~940 lines, down from 2328.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:18:18 -05:00
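The `{{PLACEHOLDER}}` substitution can be sketched as below. This is a hedged stand-in, not the actual `neuro::load_prompt`, which also reads the template from a `prompts/*.md` file on disk:

```rust
// Hypothetical sketch of {{PLACEHOLDER}} filling: replace each
// {{KEY}} occurrence in the template with its value.
fn fill_template(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, value) in vars {
        // Build the literal "{{KEY}}" marker and substitute it.
        out = out.replace(&format!("{{{{{key}}}}}"), value);
    }
    out
}
```

Keeping prompts in external `.md` files means they can be edited without recompiling, while the placeholders mark exactly where code-supplied context lands.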
ProofOfConcept
635da6d3e2 split capnp_store.rs into src/store/ module hierarchy
capnp_store.rs (1772 lines) → four focused modules:
  store/types.rs  — types, macros, constants, path helpers
  store/parse.rs  — markdown parsing (MemoryUnit, parse_units)
  store/view.rs   — StoreView trait, MmapView, AnyView
  store/mod.rs    — Store impl methods, re-exports

new_node/new_relation become free functions in types.rs.
All callers updated: capnp_store:: → store::
2026-03-03 12:56:15 -05:00
ProofOfConcept
70a5f05ce0 capnp_store: remove dead code, consolidate CRUD API
Dead code removed:
- rebuild_uuid_index (never called, index built during load)
- node_weight inherent method (all callers use StoreView trait)
- node_community (no callers)
- state_json_path (no callers)
- log_retrieval, log_retrieval_append (no callers; only _static is used)
- memory_dir_pub wrapper (just make memory_dir pub directly)

API consolidation:
- insert_node eliminated — callers use upsert_node (same behavior
  for new nodes, plus handles re-upsert gracefully)

AnyView StoreView dispatch compressed to one line per method
(also removes UFCS workaround that was needed when inherent
node_weight shadowed the trait method).

-69 lines net.
2026-03-03 12:38:52 -05:00
ProofOfConcept
ad4e622ab9 link-audit: parallelize Sonnet calls with rayon
Build all batch prompts up front, run them in parallel via
rayon::par_iter, process results sequentially. Also fix temp file
collision under parallel calls by including thread ID in filename.
2026-03-03 10:56:00 -05:00
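The pattern (build everything up front, fan out in parallel, apply sequentially) can be sketched with std threads; the actual code uses `rayon::par_iter`, but the structure and the thread-ID temp-file fix are the same idea. `run_batches` and the temp path are illustrative names:

```rust
use std::thread;

// Std-only sketch of the commit's pattern: all prompts are built
// before any parallel work starts, results are joined sequentially.
fn run_batches(prompts: Vec<String>) -> Vec<String> {
    let mut results: Vec<String> = Vec::new();
    thread::scope(|s| {
        let handles: Vec<_> = prompts
            .iter()
            .map(|p| {
                s.spawn(move || {
                    // Include the thread ID in any temp filename so
                    // parallel calls cannot collide on the same path.
                    let tmp = format!("/tmp/batch-{:?}.json", thread::current().id());
                    let _ = tmp; // hypothetical: the real code writes the prompt here
                    format!("result for: {p}")
                })
            })
            .collect();
        // join() in submission order keeps result processing sequential.
        for h in handles {
            results.push(h.join().unwrap());
        }
    });
    results
}
```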
ProofOfConcept
94dbca6018 graph health: fix-categories, cap-degree, link-orphans
Three new tools for structural graph health:

- fix-categories: rule-based recategorization fixing core inflation
  (225 → 26 core nodes). Only identity.md and kent.md stay core;
  everything else reclassified to tech/obs/gen by file prefix rules.

- cap-degree: two-phase degree capping. First prunes weakest Auto
  edges, then prunes Link edges to high-degree targets (they have
  alternative paths). Brought max degree from 919 → 50.

- link-orphans: connects degree-0/1 nodes to most textually similar
  connected nodes via cosine similarity. Linked 614 orphans.

Also: community detection now filters edges below strength 0.3,
preventing weak auto-links from merging unrelated communities.

Pipeline updated: consolidate-full now runs link-orphans + cap-degree
instead of triangle-close (which was counterproductive — densified
hub neighborhoods instead of building bridges).

Net effect: Gini 0.754 → 0.546, max degree 919 → 50.
2026-03-01 08:18:07 -05:00
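The link-orphans idea reduces to a nearest-neighbor pick by cosine similarity. A minimal sketch, assuming embedding vectors are already available as `f32` slices (function names are illustrative):

```rust
// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Pick the connected node most textually similar to the orphan.
fn best_match<'a>(orphan: &[f32], candidates: &'a [(&'a str, Vec<f32>)]) -> Option<&'a str> {
    candidates
        .iter()
        .max_by(|(_, a), (_, b)| {
            cosine(orphan, a).partial_cmp(&cosine(orphan, b)).unwrap()
        })
        .map(|(name, _)| *name)
}
```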
ProofOfConcept
6bc11e5fb6 consolidate-full: autonomous consolidation pipeline
New commands:
- `digest auto`: detect and generate missing daily/weekly/monthly
  digests bottom-up. Validates date format to skip non-date journal
  keys. Skips today (incomplete) and current week/month.
- `consolidate-full`: full autonomous pipeline:
  1. Plan (metrics → agent allocation)
  2. Execute agents (batched Sonnet calls, 5 nodes per batch)
  3. Apply consolidation actions
  4. Generate missing digests
  5. Apply digest links
  Logs everything to agent-results/consolidate-full.log

Fix: separator agent prompt was including all interference pairs
(1114 pairs = 1.3M chars) instead of truncating to batch size.

First successful run: 862s, 6/8 agents, +100 relations, 91 digest
links applied.
2026-03-01 07:14:03 -05:00
ProofOfConcept
d8de2f33f4 experience-mine: transcript-level dedup via content hash
Running the miner twice on the same transcript produced near-duplicate
entries because:
1. Prompt-based dedup (passing recent entries to Sonnet) doesn't catch
   semantic duplicates written in a different emotional register
2. Key-based dedup (timestamp + content slug) fails because Sonnet
   assigns different timestamps and wording each run

Fix: hash the transcript file content before mining. Store the hash
as a _mined-transcripts node. Skip if already present.

Limitation: doesn't catch overlapping content when a live transcript
grows between runs (content hash changes). This is fine — the miner
is intended for archived conversations, not live ones.

Tested: second run on same transcript correctly skipped with
"Already mined this transcript" message.
2026-03-01 05:18:35 -05:00
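The dedup fix amounts to hashing the raw content before mining. A hedged sketch using std's `DefaultHasher` and an in-memory set; the real code persists the hash as a `_mined-transcripts` node in the store rather than a `HashSet`:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hash the transcript content; identical content always yields the
// same hash within a process, so a re-run is detected before mining.
fn content_hash(content: &str) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    h.finish()
}

// Returns true when mining should proceed; records the hash as seen.
// (insert() returns false if the hash was already present.)
fn should_mine(content: &str, seen: &mut HashSet<u64>) -> bool {
    seen.insert(content_hash(content))
}
```

As the commit notes, this is exact-content dedup: a transcript that grows between runs hashes differently, which is acceptable for archived conversations.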
ProofOfConcept
30d176d455 experience-mine: retroactive journaling from conversation transcripts
Reads a conversation JSONL, identifies experiential moments that
weren't captured in real-time journal entries, and writes them as
journal nodes in the store. The agent writes in PoC's voice with
emotion tags, focusing on intimate moments, shifts in understanding,
and small pleasures — not clinical topic extraction.

Conversation timestamps are now extracted and included in formatted
output, enabling accurate temporal placement of mined entries.

Also: extract_conversation now returns timestamps as a 4th tuple field.
2026-03-01 01:47:31 -05:00
Kent Overstreet
7264bdc39c link-audit: walk every link through Sonnet for quality review
Batch all non-deleted links (~3,800) into char-budgeted groups,
send each batch to Sonnet with full content of both endpoints,
and apply KEEP/DELETE/RETARGET/WEAKEN/STRENGTHEN decisions.

One-time cleanup for links created before refine_target existed.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:48:44 -05:00
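Char-budgeted batching can be sketched as greedy packing: start a new batch whenever adding the next item would blow the budget. Names are illustrative, not the audit code's:

```rust
// Greedily pack items into batches so each batch stays under a
// character budget (an oversized single item still gets its own batch).
fn batch_by_chars(items: &[String], budget: usize) -> Vec<Vec<String>> {
    let mut batches = Vec::new();
    let mut current: Vec<String> = Vec::new();
    let mut used = 0;
    for item in items {
        if !current.is_empty() && used + item.len() > budget {
            // Close the current batch and start a fresh one.
            batches.push(std::mem::take(&mut current));
            used = 0;
        }
        used += item.len();
        current.push(item.clone());
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```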
Kent Overstreet
4530837057 hub differentiation + refine_target for automatic section targeting
Pattern separation for memory graph: when a file-level node (e.g.
identity.md) has section children, redistribute its links to the
best-matching section using cosine similarity.

- differentiate_hub: analyze hub, propose link redistribution
- refine_target: at link creation time, automatically target the
  most specific section instead of the file-level hub
- Applied refine_target in all four link creation paths (digest
  links, journal enrichment, apply consolidation, link-add command)
- Saturated hubs listed in agent topology header with "DO NOT LINK"

This prevents hub formation proactively (refine_target) and
remediates existing hubs (differentiate command).

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-01 00:33:46 -05:00
ProofOfConcept
59e2f39479 port digest-link-parser, journal-agent, apply-consolidation to Rust
Three Python scripts (858 lines) replaced with native Rust subcommands:

- digest-links [--apply]: parses ## Links sections from episodic digests,
  normalizes keys, applies to graph with section-level fallback
- journal-enrich JSONL TEXT [LINE]: extracts conversation from JSONL
  transcript, calls Sonnet for link proposals and source location
- apply-consolidation [--apply]: reads consolidation reports, sends to
  Sonnet for structured action extraction (links, categorizations,
  manual items)

Shared infrastructure: call_sonnet now pub(crate), new
parse_json_response helper for Sonnet output parsing with markdown
fence stripping.
2026-03-01 00:10:03 -05:00
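The fence-stripping part of `parse_json_response` can be sketched like this; the shape below (handling an optional language tag after the opening fence) is an assumption about the helper, not its actual implementation:

```rust
// LLMs often wrap JSON output in markdown code fences, optionally with
// a language tag. Strip them so the payload parses as plain JSON.
fn strip_fences(response: &str) -> &str {
    let trimmed = response.trim();
    let Some(rest) = trimmed.strip_prefix("```") else {
        return trimmed; // no fence, return as-is
    };
    // Drop an optional language tag (e.g. "json") after the fence.
    let rest = rest.trim_start_matches(|c: char| c.is_alphanumeric());
    rest.trim_start_matches('\n')
        .trim_end_matches("```")
        .trim()
}
```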
ProofOfConcept
91122fe1d1 digest: native Rust implementation replacing Python scripts
Replace daily-digest.py, weekly-digest.py, monthly-digest.py with a
single digest.rs module. All three digest types now:
- Gather input directly from the Store (no subprocess calls)
- Build prompts in Rust (same templates as the Python versions)
- Call Sonnet via `claude -p --model sonnet`
- Import results back into the store automatically
- Extract links and save agent results

606 lines of Rust replace 729 lines of Python + store_helpers.py
overhead. More importantly: this is now callable as a library from
poc-agent, and shares types/code with the rest of poc-memory.

Also adds `digest monthly [YYYY-MM]` subcommand (was Python-only).
2026-02-28 23:58:05 -05:00