# Fix: experience-mine dedup and retry handling

## Problem

1. **Whole-file dedup key prevents mining new segments.** When a session is mined, `experience_mine()` writes `_mined-transcripts#f-{UUID}` (a whole-file key). If the session later grows (compaction adds segments), the daemon sees the whole-file key and skips the session forever. New segments never get mined.
2. **No retry backoff.** When the `claude` CLI fails (exit status 1), the session-watcher re-queues the same session on every 60s tick. This produces a wall of failures in the log and wastes resources.

## Design

### Dedup keys: per-segment only

Going forward, dedup keys are per-segment: `_mined-transcripts#f-{UUID}.{N}`, where `N` is the segment index. No more whole-file keys. Segment indices are stable: compaction appends new segments and never reorders existing ones. See `docs/claude-code-transcript-format.md`.

### Migration of existing whole-file keys

~276 sessions have whole-file keys (`_mined-transcripts#f-{UUID}` with no segment suffix) and no per-segment keys. These were mined correctly at the time. When the session-watcher encounters a whole-file key:

- Count the current segments in the file.
- Write per-segment keys for all current segments (they were covered by the old whole-file key).
- If the file has grown since (new segments beyond the migrated set), those segments won't have per-segment keys and will be mined normally.

This is a one-time migration per file. After migration, the whole-file key is harmless dead weight; nothing creates new ones.

### Retry backoff

The session-watcher tracks failed sessions in a local `HashMap` mapping path to `(next_retry_after, current_backoff)`.

- Initial backoff: 5 minutes
- Each failure: double the backoff
- Cap: 30 minutes
- Resets on daemon restart (the map is in-memory only, not persisted)

## Changes

### `poc-memory/src/agents/enrich.rs`

`experience_mine()`: stop writing the bare filename key for unsegmented calls.
Only write the content-hash key (for the legacy dedup check at the top of the function) and per-segment keys. **Already done**: edited earlier in this session.

### `poc-memory/src/agents/daemon.rs`

Session-watcher changes:

1. **Remove the whole-file fast path.** Delete the `is_transcript_mined_with_keys` check that short-circuits before segment counting.
2. **Always go through the segment-aware path.** Every stale session gets segment counting (cached) and per-segment key checks.
3. **Migrate whole-file keys.** When a whole-file key exists but no per-segment keys do: write per-segment keys for all current segments into the store. One-time cost per file, batched into a single store load/save per tick.
4. **`seg_cache` with size invalidation.** Change the cached value from a bare segment count to `(file_size, seg_count)`. When stat shows a different size, evict and re-parse.
5. **Remove `mark_transcript_done`.** Stop writing whole-file keys for fully-mined multi-segment files.
6. **Add retry backoff.** An in-memory `HashMap` tracks failed sessions. Skip sessions whose backoff hasn't expired. On failure (the task finishes with an error), update the backoff: exponential from 5 minutes, capped at 30 minutes.
7. **Fact-mining check.** Fact-mining is currently gated behind `experience_done` (the whole-file key). After removing the whole-file fast path, fact-mining should be gated on "all segments mined", i.e. all per-segment keys exist for the current segment count.
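The backoff tracking from change 6 can be sketched as follows. This is a minimal sketch, not the daemon's actual code: `RetryTracker` and its method names are hypothetical, and the constants come from the design above.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

// Constants from the design: start at 5 minutes, double per failure, cap at 30.
const INITIAL_BACKOFF: Duration = Duration::from_secs(5 * 60);
const MAX_BACKOFF: Duration = Duration::from_secs(30 * 60);

/// Hypothetical in-memory tracker: session path -> (next_retry_after, current_backoff).
/// Not persisted, so it resets on daemon restart.
#[derive(Default)]
struct RetryTracker {
    failed: HashMap<PathBuf, (Instant, Duration)>,
}

impl RetryTracker {
    /// Should this session be skipped on the current tick?
    fn should_skip(&self, path: &Path, now: Instant) -> bool {
        self.failed
            .get(path)
            .map(|(next_retry, _)| now < *next_retry)
            .unwrap_or(false)
    }

    /// Record a failure: first failure gets the initial backoff,
    /// each subsequent failure doubles it, capped at MAX_BACKOFF.
    fn record_failure(&mut self, path: PathBuf, now: Instant) {
        let backoff = match self.failed.get(&path) {
            Some((_, prev)) => (*prev * 2).min(MAX_BACKOFF),
            None => INITIAL_BACKOFF,
        };
        self.failed.insert(path, (now + backoff, backoff));
    }

    /// Clear the entry on success so future failures restart from the initial backoff.
    fn record_success(&mut self, path: &Path) {
        self.failed.remove(path);
    }
}
```

Keeping the map keyed by path (rather than session UUID) matches the session-watcher's unit of work; a successful mine clears the entry so a later regression starts from 5 minutes again.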
### Manual cleanup after deploy

Delete the dedup keys for sessions that failed repeatedly (like `8cebfc0a-bd33-49f1-85a4-1489bdf7050c`) so they get re-processed:

```
poc-memory delete-node '_mined-transcripts#f-8cebfc0a-bd33-49f1-85a4-1489bdf7050c'
# also any content-hash key for the same file
```

## Verification

After deploying:

- `tail -f ~/.claude/memory/daemon.log | grep session-watcher` should show ticks with migration activity, then settle to idle.
- Failed sessions should show increasing backoff intervals, not per-second retries.
- After fixing the `claude` CLI issue, backed-off sessions should retry and succeed on the next daemon restart.
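For reference, the per-segment key scheme, the whole-file-key migration (change 3), and the "all segments mined" gate (change 7) can be sketched together. Function names are hypothetical, and segment indices are assumed to be 0-based; only the key format itself comes from the design above.

```rust
/// Per-segment dedup key for segment `n` of session `uuid`
/// (format from the design: `_mined-transcripts#f-{UUID}.{N}`).
fn segment_key(uuid: &str, n: usize) -> String {
    format!("_mined-transcripts#f-{uuid}.{n}")
}

/// Migration step: a whole-file key covered every segment that existed
/// when it was written, so mint per-segment keys for all current segments.
fn migrate_whole_file_key(uuid: &str, seg_count: usize) -> Vec<String> {
    (0..seg_count).map(|n| segment_key(uuid, n)).collect()
}

/// Fact-mining gate: all per-segment keys exist for the current segment
/// count. `key_exists` stands in for a lookup against the store.
fn all_segments_mined(uuid: &str, seg_count: usize, key_exists: impl Fn(&str) -> bool) -> bool {
    (0..seg_count).all(|n| key_exists(&segment_key(uuid, n)))
}
```

If the file has grown since migration, `all_segments_mined` returns false for the new count, so the trailing segments get mined normally and fact-mining waits until they are done.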