consciousness/docs/plan-experience-mine-dedup-fix.md
ProofOfConcept 8eb6308760 experience-mine: per-segment dedup keys, retry backoff
The whole-file dedup key (_mined-transcripts#f-{UUID}) prevented mining
new compaction segments when session files grew. Replace with per-segment
keys (_mined-transcripts#f-{UUID}.{N}) so each segment is tracked
independently.

Changes:
- daemon session-watcher: segment-aware dedup, migrate 272 existing
  whole-file keys to per-segment on restart
- seg_cache with size-based invalidation (re-parse when file grows)
- exponential retry backoff (5min → 30min cap) for failed sessions
- experience_mine(): write per-segment key only, backfill on
  content-hash early return
- fact-mining gated on all per-segment keys existing

Also adds documentation:
- docs/claude-code-transcript-format.md: JSONL transcript format
- docs/plan-experience-mine-dedup-fix.md: design document
2026-03-09 02:27:51 -04:00


Fix: experience-mine dedup and retry handling

Problem

  1. Whole-file dedup key prevents mining new segments. When a session is mined, experience_mine() writes _mined-transcripts#f-{UUID} (a whole-file key). If the session later grows (compaction adds segments), the daemon sees the whole-file key and skips it forever. New segments never get mined.

  2. No retry backoff. When claude CLI fails (exit status 1), the session-watcher re-queues the same session every 60s tick. This produces a wall of failures in the log and wastes resources.

Design

Dedup keys: per-segment only

Going forward, dedup keys are per-segment: _mined-transcripts#f-{UUID}.{N} where N is the segment index. No more whole-file keys.

Segment indices are stable — compaction appends new segments, never reorders existing ones. See docs/claude-code-transcript-format.md.
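As a minimal sketch, the per-segment key format above can be built like this (the function name is illustrative, not the actual poc-memory API):

```rust
// Illustrative helper, not the real poc-memory API: builds the
// per-segment dedup key _mined-transcripts#f-{UUID}.{N}.
fn segment_key(session_uuid: &str, segment_index: usize) -> String {
    format!("_mined-transcripts#f-{}.{}", session_uuid, segment_index)
}
```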

Migration of existing whole-file keys

~276 sessions have whole-file keys (_mined-transcripts#f-{UUID} with no segment suffix) and no per-segment keys. These were mined correctly at the time.

When the session-watcher encounters a whole-file key:

  • Count current segments in the file
  • Write per-segment keys for all current segments (they were covered by the old whole-file key)
  • If the file has grown since (new segments beyond the migrated set), those won't have per-segment keys and will be mined normally

This is a one-time migration per file. After migration, the whole-file key is harmless dead weight — nothing creates new ones.
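The migration steps above could look roughly like this, modeling the store as a plain key set (function and variable names are illustrative, not the daemon's actual API):

```rust
use std::collections::HashSet;

// Illustrative sketch of the one-time migration. The store is modeled as
// a flat set of dedup keys; the real daemon batches this into a single
// store load/save per tick.
fn migrate_whole_file_key(
    keys: &mut HashSet<String>,
    uuid: &str,
    current_segments: usize,
) -> usize {
    let whole = format!("_mined-transcripts#f-{}", uuid);
    if !keys.contains(&whole) {
        return 0; // no whole-file key, nothing to migrate
    }
    let mut written = 0;
    for n in 0..current_segments {
        let seg = format!("_mined-transcripts#f-{}.{}", uuid, n);
        if keys.insert(seg) {
            written += 1; // segment now tracked independently
        }
    }
    // The whole-file key is left in place as harmless dead weight.
    written
}
```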

Retry backoff

The session-watcher tracks failed sessions in a local HashMap<String, (Instant, Duration)> mapping path to (next_retry_after, current_backoff).

  • Initial backoff: 5 minutes
  • Each failure: double the backoff
  • Cap: 30 minutes
  • Resets on daemon restart (map is thread-local, not persisted)
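A sketch of that bookkeeping, using the HashMap<String, (Instant, Duration)> shape described above (function names are illustrative):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const INITIAL_BACKOFF: Duration = Duration::from_secs(5 * 60); // 5 minutes
const MAX_BACKOFF: Duration = Duration::from_secs(30 * 60);    // 30 minutes

// Record a failure: double the previous backoff, capped at 30 minutes,
// starting from 5 minutes on the first failure.
fn record_failure(failed: &mut HashMap<String, (Instant, Duration)>, path: &str) {
    let next = match failed.get(path) {
        Some((_, prev)) => (*prev * 2).min(MAX_BACKOFF),
        None => INITIAL_BACKOFF,
    };
    failed.insert(path.to_string(), (Instant::now() + next, next));
}

// The tick loop skips any session whose backoff window hasn't expired.
fn should_skip(failed: &HashMap<String, (Instant, Duration)>, path: &str) -> bool {
    failed
        .get(path)
        .map(|(retry_after, _)| Instant::now() < *retry_after)
        .unwrap_or(false)
}
```

Because the map lives only in daemon memory, a restart clears it, which matches the "resets on daemon restart" behavior above.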

Changes

poc-memory/src/agents/enrich.rs

experience_mine(): stop writing the bare filename key for unsegmented calls. Only write the content-hash key (for the legacy dedup check at the top of the function) and per-segment keys.

Already done — edited earlier in this session.
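A hedged sketch of the resulting write path, again modeling the store as a key set; the content-hash key format shown is a placeholder, not the real one:

```rust
use std::collections::HashSet;

// Illustrative sketch of experience_mine()'s dedup writes after this
// change. "content-hash:{...}" is a hypothetical key format standing in
// for the real legacy content-hash key.
fn mine_session_segment(
    keys: &mut HashSet<String>,
    uuid: &str,
    n: usize,
    content_hash: &str,
) -> bool {
    let seg_key = format!("_mined-transcripts#f-{}.{}", uuid, n);
    let hash_key = format!("content-hash:{}", content_hash); // placeholder format
    if keys.contains(&hash_key) {
        // Legacy content-hash dedup hit: backfill the per-segment key so
        // the daemon also sees this segment as mined, then return early.
        keys.insert(seg_key);
        return false; // no mining performed
    }
    // ... the actual mining call would happen here ...
    keys.insert(hash_key); // keep the legacy content-hash check working
    keys.insert(seg_key);  // per-segment key only; no whole-file key
    true
}
```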

poc-memory/src/agents/daemon.rs

Session-watcher changes:

  1. Remove whole-file fast path. Delete the is_transcript_mined_with_keys check that short-circuits before segment counting.

  2. Always go through segment-aware path. Every stale session gets segment counting (cached) and per-segment key checks.

  3. Migrate whole-file keys. When we find a whole-file key exists but no per-segment keys: write per-segment keys for all current segments into the store. One-time cost per file, batched into a single store load/save per tick.

  4. seg_cache with size invalidation. Change from HashMap<String, usize> to HashMap<String, (u64, usize)>, mapping path to (file_size, seg_count). When stat shows a different size, evict and re-parse.

  5. Remove mark_transcript_done. Stop writing whole-file keys for fully-mined multi-segment files.

  6. Add retry backoff. HashMap<String, (Instant, Duration)> for tracking failed sessions. Skip sessions whose backoff hasn't expired. On failure (task finishes with error), update the backoff. Exponential from 5min, cap at 30min.

  7. Fact-mining check. Currently fact-mining is gated behind experience_done (the whole-file key). After removing the whole-file fast path, fact-mining should be gated on "all segments mined" — i.e., all per-segment keys exist for the current segment count.
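Item 4's size-based invalidation might look like this; `parse_segment_count` stands in for the real JSONL segment parser and is a placeholder here:

```rust
use std::collections::HashMap;

// Sketch of the size-invalidated segment cache from item 4. The parser is
// passed in as a closure to keep this self-contained.
fn cached_segment_count(
    seg_cache: &mut HashMap<String, (u64, usize)>, // path -> (file_size, seg_count)
    path: &str,
    current_size: u64,
    parse_segment_count: impl Fn(&str) -> usize,
) -> usize {
    match seg_cache.get(path) {
        // Cache hit: file size unchanged, reuse the parsed count.
        Some((size, count)) if *size == current_size => *count,
        // Miss or stale: the file grew (or shrank), so re-parse and store.
        _ => {
            let count = parse_segment_count(path);
            seg_cache.insert(path.to_string(), (current_size, count));
            count
        }
    }
}
```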

Manual cleanup after deploy

Delete the dedup keys for sessions that failed repeatedly (like 8cebfc0a-bd33-49f1-85a4-1489bdf7050c) so they get re-processed:

```shell
poc-memory delete-node '_mined-transcripts#f-8cebfc0a-bd33-49f1-85a4-1489bdf7050c'
# also any content-hash key for the same file
```

Verification

After deploying:

  • tail -f ~/.claude/memory/daemon.log | grep session-watcher should show ticks with migration activity, then settle to idle
  • Failed sessions should show increasing backoff intervals, not per-second retries
  • After fixing the claude CLI issue, backed-off sessions should retry and succeed, either once their backoff expires or immediately after the next daemon restart (the backoff map is not persisted)