Commit graph

11 commits

Author SHA1 Message Date
Kent Overstreet
804578b977 query by NodeType instead of key prefix
Replace key prefix matching (journal#j-, daily-, weekly-, monthly-)
with NodeType filters (EpisodicSession, EpisodicDaily, EpisodicWeekly,
EpisodicMonthly) for all queries: journal-tail, digest gathering,
digest auto-detection, experience mining dedup, and find_journal_node.

Add EpisodicMonthly to NodeType enum and capnp schema.

Key naming conventions (journal#j-TIMESTAMP-slug, daily-DATE, etc.)
are retained for key generation — the fix is about how we find nodes,
not how we name them.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:14:37 -04:00
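
A minimal sketch of the idea in 804578b977, selecting nodes by type rather than by key prefix. The `Node` struct, the `Semantic` variant, and `tail_by_type` are hypothetical stand-ins; the store's real definitions are not shown in this log.

```rust
// Simplified, hypothetical model of the store. Only the four Episodic*
// variant names come from the commit message; everything else is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeType {
    EpisodicSession,
    EpisodicDaily,
    EpisodicWeekly,
    EpisodicMonthly,
    Semantic, // stand-in for any non-episodic type
}

#[derive(Debug, Clone)]
pub struct Node {
    pub key: String,
    pub node_type: NodeType,
    pub created_at: i64, // seconds since the epoch
}

/// Return the `n` most recent nodes of the given type, oldest first.
/// The key is never inspected, so a node is found even if its key does
/// not follow the usual naming convention.
pub fn tail_by_type(nodes: &[Node], ty: NodeType, n: usize) -> Vec<&Node> {
    let mut hits: Vec<&Node> = nodes.iter().filter(|node| node.node_type == ty).collect();
    hits.sort_by_key(|node| node.created_at);
    let skip = hits.len().saturating_sub(n);
    hits.split_off(skip)
}
```

With this shape, something like journal-tail would call `tail_by_type(&nodes, NodeType::EpisodicSession, 10)`, and key naming stays a concern of key generation only.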
Kent Overstreet
46f8fe662e store: strip .md suffix from all keys
The .md suffix on keys was a vestige of the file-based era. resolve_key() added .md
to lookups while upsert() used bare keys, creating phantom duplicate
nodes (the instructions bug: writes went to "instructions", reads
found "instructions.md").

- resolve_key: strip the .md suffix instead of appending it
- Update all hardcoded key patterns (journal.md# → journal#, etc)
- Add strip_md_keys() migration to fsck: renames nodes and relations
- Add broken link detection to health report
- Delete redirect table (no longer needed)
- Update config defaults and config.jsonl

Migration: run `poc-memory fsck` to rename existing keys.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 19:41:26 -04:00
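
A rough sketch of the key normalization and migration described in 46f8fe662e. The function names `strip_md` and `migrate_keys`, the `HashMap` store, and the collision policy (keep the existing bare-key node) are assumptions for illustration; the real strip_md_keys() also renames relations, which is omitted here.

```rust
use std::collections::HashMap;

/// Drop a ".md" suffix from the file part of a key. The file part is
/// everything before the first '#', so "journal.md#x" becomes "journal#x".
/// Text after the '#' is left untouched.
pub fn strip_md(key: &str) -> String {
    match key.split_once('#') {
        Some((file, section)) => {
            format!("{}#{}", file.strip_suffix(".md").unwrap_or(file), section)
        }
        None => key.strip_suffix(".md").unwrap_or(key).to_string(),
    }
}

/// Rewrite every key that still carries ".md" and return how many keys
/// were rewritten. Assumed policy: if the bare key already exists, the
/// existing node wins and the stale ".md" node is dropped.
pub fn migrate_keys(store: &mut HashMap<String, String>) -> usize {
    let stale: Vec<String> = store
        .keys()
        .filter(|k| strip_md(k.as_str()) != k.as_str())
        .cloned()
        .collect();
    let mut rewritten = 0;
    for old in stale {
        if let Some(value) = store.remove(&old) {
            store.entry(strip_md(&old)).or_insert(value);
            rewritten += 1;
        }
    }
    rewritten
}
```

Once reads and writes both pass through the same normalization, a write to "instructions" and a read of "instructions.md" resolve to one node.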
ProofOfConcept
45335de220 experience-mine: split oversized sessions at compaction boundaries
Claude Code doesn't create new session files on context compaction —
a single UUID can accumulate 170+ conversations, producing 400MB+
JSONL files that generate 1.3M token prompts.

Split at compaction markers ("This session is being continued..."):
- extract_conversation made pub, split_on_compaction splits messages
- experience_mine takes optional segment index
- daemon watcher parses files, spawns per-segment jobs (.0, .1, .2)
- seg_cache memoizes segment counts across ticks
- per-segment dedup keys; whole-file key when all segments complete
- 150K token guard skips any remaining oversized segments
- char-boundary-safe truncation in enrich.rs and fact_mine.rs

Backwards compatible: unsegmented calls still write content-hash
dedup keys, old whole-file mined keys still recognized.
2026-03-07 12:01:38 -05:00
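
Two pieces of 45335de220 can be sketched briefly: splitting a message list at compaction markers, and truncating a string without cutting a UTF-8 character. The marker text is quoted from the commit message; matching it by prefix, the `Vec<String>` message type, and the name `truncate_at_char_boundary` are assumptions, not the project's actual code.

```rust
/// Marker text quoted in the commit message; prefix matching is assumed.
const COMPACTION_MARKER: &str = "This session is being continued";

/// Split messages into segments. A message that starts with the marker
/// begins a new segment, so the marker stays with the conversation it
/// introduces.
pub fn split_on_compaction(messages: &[String]) -> Vec<Vec<String>> {
    let mut segments: Vec<Vec<String>> = Vec::new();
    let mut current: Vec<String> = Vec::new();
    for msg in messages {
        if msg.starts_with(COMPACTION_MARKER) && !current.is_empty() {
            // Close the running segment before starting the next one.
            segments.push(std::mem::take(&mut current));
        }
        current.push(msg.clone());
    }
    if !current.is_empty() {
        segments.push(current);
    }
    segments
}

/// Truncate to at most `max_bytes` bytes, backing up to the nearest
/// character boundary so the slice never panics on multi-byte text.
pub fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Each returned segment would then map to one per-segment job (.0, .1, .2), with the segment index forming part of its dedup key.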
ProofOfConcept
fca9e58713 enrich: fix dedup keys never written for empty mining results
The early return on line 343 when the LLM found no missed experiences
bypassed the dedup key writes at lines 397-414, despite the comment
saying "even if count == 0, to prevent re-runs." This caused sessions
with nothing to mine to be re-mined every 60s tick indefinitely.

Fix: replace the early return with a conditional print, so the dedup
keys are always written and saved.
2026-03-07 00:09:35 -05:00
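
The control-flow bug fixed in fca9e58713 can be illustrated with a small sketch. `finish_mining`, `needs_mining`, and the `HashSet` of dedup keys are hypothetical; they model only the shape of the fix, not the code in enrich.rs.

```rust
use std::collections::HashSet;

/// Record the outcome of a mining pass and always write the dedup key.
/// Before the fix, the equivalent of the `count == 0` branch returned
/// early, so the insert below never ran for empty results.
pub fn finish_mining(mined: &mut HashSet<String>, dedup_key: &str, count: usize) -> String {
    let summary = if count == 0 {
        "no missed experiences".to_string()
    } else {
        format!("{} experiences mined", count)
    };
    // Written unconditionally, even when count == 0, to prevent re-runs.
    mined.insert(dedup_key.to_string());
    summary
}

/// What a periodic tick would check before queueing a session again.
pub fn needs_mining(mined: &HashSet<String>, dedup_key: &str) -> bool {
    !mined.contains(dedup_key)
}
```

Turning the early return into a branch that only changes the message keeps a single exit path, which makes it harder to skip the bookkeeping by accident.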
ProofOfConcept
841cfe035b enrich: backfill filename dedup key on content-hash hit
Transcripts mined before the filename-key feature was added had
content-hash keys (#h-) but no filename keys (#f-). The daemon's
fast-path check only looks at filename keys, so these sessions were
re-queued every tick, hitting the content-hash dedup (0.0s) but
returning early before writing the filename key — a self-perpetuating
loop burning Sonnet quota on ~560 phantom re-mines per minute.

Fix: when the content-hash dedup fires and no filename key exists,
backfill it before returning.
2026-03-06 23:43:34 -05:00
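
A sketch of the backfill in 841cfe035b, under simplifying assumptions: dedup keys live in a `HashSet`, and keys are formed as `#h-<hash>` and `#f-<filename>`. Only the `#h-` and `#f-` markers come from the commit message; the full key layout and the function names are hypothetical.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum MineOutcome {
    AlreadyMined,
    Mined,
}

/// Mine one transcript, deduplicating on its content hash.
pub fn mine_transcript(keys: &mut HashSet<String>, filename: &str, content_hash: &str) -> MineOutcome {
    let hash_key = format!("#h-{}", content_hash);
    let file_key = format!("#f-{}", filename);
    if keys.contains(&hash_key) {
        // The fix: backfill the filename key before returning, so the
        // daemon's fast path stops re-queueing this transcript.
        keys.insert(file_key);
        return MineOutcome::AlreadyMined;
    }
    // The actual mining call is elided in this sketch.
    keys.insert(hash_key);
    keys.insert(file_key);
    MineOutcome::Mined
}

/// The daemon's fast-path check, which only looks at filename keys.
pub fn daemon_should_queue(keys: &HashSet<String>, filename: &str) -> bool {
    !keys.contains(&format!("#f-{}", filename))
}
```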
ProofOfConcept
36cb3b641f enrich: set created_at from event timestamp, not mining time
Experience-mined journal entries were all getting created_at = now(),
causing them to sort by mining time instead of when the event actually
happened. Parse the conversation timestamp and set created_at to the
event time so journal-tail shows correct chronological order.
2026-03-06 22:09:44 -05:00
ProofOfConcept
80bdaab8ee enrich: explicitly filter for text blocks in transcript extraction
Only extract content blocks with "type": "text". Previously relied on
tool_use/tool_result blocks lacking a "text" field, which worked but
was fragile. Now explicitly checks block type.
2026-03-06 21:54:19 -05:00
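
The explicit type check from 80bdaab8ee, sketched over a hypothetical `Block` struct. The real code presumably works on parsed JSON; a plain struct is used here to keep the example dependency-free.

```rust
/// Hypothetical, simplified content block from a transcript message.
pub struct Block {
    pub kind: String,         // the block's "type" field
    pub text: Option<String>, // present on text blocks, possibly on others
}

/// Join the text of all blocks whose type is exactly "text".
/// Checking the type first means a non-text block that happens to carry
/// a text field is still excluded.
pub fn extract_text(blocks: &[Block]) -> String {
    blocks
        .iter()
        .filter(|b| b.kind == "text")
        .filter_map(|b| b.text.as_deref())
        .collect::<Vec<_>>()
        .join("\n")
}
```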
ProofOfConcept
82b33c449c llm: full per-agent usage logging with prompts and responses
Log every model call to ~/.claude/memory/llm-logs/YYYY-MM-DD.md with
full prompt, response, agent type, model, duration, and status. One
file per day, markdown formatted for easy reading.

Agent types: fact-mine, experience-mine, consolidate, knowledge,
digest, enrich, audit. This gives visibility into what each agent
is doing and whether to adjust prompts or frequency.
2026-03-05 22:52:08 -05:00
ProofOfConcept
552d255dc3 migrate agent output to capnp store, add provenance tracking
All agent output now goes to the store as nodes instead of
markdown/JSON files. Each node carries a Provenance enum identifying
which agent created it (AgentDigest, AgentConsolidate, AgentFactMine,
AgentKnowledgeObservation, etc — 14 variants total).

Store changes:
- upsert_provenance() method for agent-created nodes
- Provenance enum expanded from 5 to 14 variants

Agent changes:
- digest: writes to store nodes (daily-YYYY-MM-DD.md etc)
- consolidate: reports/actions/logs stored as _consolidation-* nodes
- knowledge: depth DB and agent output stored as _knowledge-* nodes
- enrich: experience-mine results go directly to store
- llm: --no-session-persistence prevents transcript accumulation

Deleted: 14 Python/shell scripts replaced by Rust implementations.
2026-03-05 15:30:57 -05:00
Kent Overstreet
f4364e299c replace libc date math with chrono, extract memory_subdir helper
- date_to_epoch, iso_week_info, weeks_in_month: replaced unsafe libc
  (mktime, strftime, localtime_r) with chrono NaiveDate and IsoWeek
- epoch_to_local: replaced unsafe libc localtime_r with chrono Local
- New util.rs with memory_subdir() helper: ensures subdir exists and
  propagates errors instead of silently ignoring them
- Removed three duplicate agent_results_dir() definitions across
  digest.rs, consolidate.rs, enrich.rs
- load_digest_files, parse_all_digest_links, find_consolidation_reports
  now return Result to properly propagate directory creation errors

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:23:43 -05:00
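
One plausible shape for the memory_subdir() helper from f4364e299c, using only the standard library. The explicit `base` parameter is an assumption made to keep the sketch self-contained; the real helper's signature is not shown in the log.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Ensure `base/name` exists and return its path. Creation errors are
/// propagated to the caller instead of being silently ignored.
pub fn memory_subdir(base: &Path, name: &str) -> io::Result<PathBuf> {
    let dir = base.join(name);
    // create_dir_all succeeds if the directory already exists.
    fs::create_dir_all(&dir)?;
    Ok(dir)
}
```

Callers that previously assumed the directory existed now have to return `Result` themselves, which matches the signature changes listed for load_digest_files and the other loaders.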
Kent Overstreet
50da0b7b26 digest: split into focused modules, externalize prompts
digest.rs was 2328 lines containing 6 distinct subsystems. Split into:
- llm.rs: shared LLM utilities (call_sonnet, parse_json_response, semantic_keys)
- audit.rs: link quality audit with parallel Sonnet batching
- enrich.rs: journal enrichment + experience mining
- consolidate.rs: consolidation pipeline + apply

Externalized all inline prompts to prompts/*.md templates using
neuro::load_prompt with {{PLACEHOLDER}} syntax:
- daily-digest.md, weekly-digest.md, monthly-digest.md
- experience.md, journal-enrich.md, consolidation.md

digest.rs retains temporal digest generation (daily/weekly/monthly/auto)
and date helpers. ~940 lines, down from 2328.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:18:18 -05:00
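
A minimal sketch of {{PLACEHOLDER}} substitution as described in 50da0b7b26. `render_prompt` is a hypothetical name; the actual signature and behavior of neuro::load_prompt, including how it reads prompts/*.md, are not shown in the log.

```rust
/// Replace each `{{NAME}}` in `template` with its value. Placeholders
/// without a matching entry in `vars` are left as-is.
pub fn render_prompt(template: &str, vars: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for &(name, value) in vars {
        let placeholder = ["{{", name, "}}"].concat();
        out = out.replace(&placeholder, value);
    }
    out
}
```

For example, rendering "Date: {{DATE}}" with `[("DATE", "2026-03-03")]` would yield "Date: 2026-03-03". Keeping prompts in template files lets them be edited without recompiling.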