
# Memory daemon
The background daemon (`poc-memory daemon`) automatically processes
session transcripts through a multi-stage pipeline, extracting
experiences and facts into the knowledge graph.
## Starting
```bash
poc-memory daemon          # run in the foreground
poc-memory daemon install  # install systemd service + hooks
```
## Pipeline stages
Each session file goes through these stages in order:
1. **find_stale_sessions** — stat-only scan for JSONL files >100KB,
older than SESSION_STALE_SECS (default 120s). No file reads.
2. **segment splitting** — files with multiple compaction boundaries
(`"This session is being continued"`) are split into segments.
Each segment gets its own LLM job. Segment counts are cached in
a `seg_cache` HashMap to avoid re-parsing large files every tick.
3. **experience-mine** — LLM extracts journal entries, observations,
and experiences from each segment. Writes results to the store.
Dedup key: `_mined-transcripts.md#f-{uuid}` (single-segment) or
`_mined-transcripts.md#f-{uuid}.{N}` (multi-segment).
4. **fact-mine** — LLM extracts structured facts (names, dates,
decisions, preferences). Only starts when all experience-mine
work is done. Dedup key: `_facts-{uuid}`.
5. **whole-file key** — for multi-segment files, once all segments
complete, a whole-file key is written so future ticks skip
re-parsing.
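The segment and dedup-key bookkeeping above can be sketched as follows. This is an illustrative model, not the daemon's actual code: the boundary marker and key formats come from this doc, while the function names and the assumption that N boundaries yield N+1 segments are mine.

```rust
// Compaction boundary marker (from the doc). Assumption: a file with N
// boundaries splits into N+1 segments.
const BOUNDARY: &str = "This session is being continued";

/// Number of segments a transcript splits into.
fn segment_count(transcript: &str) -> usize {
    transcript.matches(BOUNDARY).count() + 1
}

/// Dedup key for an experience-mine job: whole-file form for
/// single-segment files, `.{N}` suffix per segment otherwise.
fn mine_key(uuid: &str, segment: Option<usize>) -> String {
    match segment {
        None => format!("_mined-transcripts.md#f-{uuid}"),
        Some(n) => format!("_mined-transcripts.md#f-{uuid}.{n}"),
    }
}

fn main() {
    let t = "first part\nThis session is being continued...\nsecond part";
    assert_eq!(segment_count(t), 2);
    assert_eq!(mine_key("abc", None), "_mined-transcripts.md#f-abc");
    assert_eq!(mine_key("abc", Some(1)), "_mined-transcripts.md#f-abc.1");
}
```

The per-segment suffix is what lets a multi-segment file record partial progress: each segment's key is written independently, and the whole-file key (stage 5) is only written once every suffixed key exists.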
## Resource management
LLM calls are gated by a jobkit resource pool (default 1 slot).
This serializes API access and prevents memory pressure from
concurrent store loads. MAX_NEW_PER_TICK (10) limits how many
tasks are spawned per 60s watcher tick.
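The spawn cap can be sketched as a simple admission step at the top of each tick. Only `MAX_NEW_PER_TICK` is from the doc; the function and variable names are assumptions (the real watcher also gates each spawned job on the jobkit pool, which this sketch omits).

```rust
// Per-tick spawn limit (from the doc).
const MAX_NEW_PER_TICK: usize = 10;

/// Given the candidate sessions found this tick, admit at most
/// MAX_NEW_PER_TICK; the rest are picked up on a later tick.
fn admit(candidates: &[String]) -> &[String] {
    let n = candidates.len().min(MAX_NEW_PER_TICK);
    &candidates[..n]
}

fn main() {
    let pending: Vec<String> =
        (0..25).map(|i| format!("session-{i}.jsonl")).collect();
    let spawned = admit(&pending);
    assert_eq!(spawned.len(), 10); // 15 candidates wait for later ticks
}
```

Capping spawns per tick keeps a large backlog from flooding the job queue in one burst, while the single-slot pool ensures that even the admitted jobs run their LLM calls one at a time.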
## Diagnostics
### Log
```bash
tail -f ~/.consciousness/memory/daemon.log
```
JSON lines with `ts`, `job`, `event`, and `detail` fields.
### Understanding the tick line
```
{"job":"session-watcher","event":"tick",
"detail":"277 stale, 219 mined, 4 extract, 0 fact, 0 open"}
```
| Field | Meaning |
|---------|---------|
| stale | Total session files on disk matching age+size criteria. This is a filesystem count — it does NOT decrease as sessions are mined. |
| mined | Sessions with both experience-mine AND fact-mine complete. |
| extract | Segments currently queued/running for experience-mine. |
| fact | Sessions queued/running for fact-mine. |
| open | Sessions still being written to (skipped). |
Progress = mined / stale. When mined equals stale, the backlog is clear.
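Parsing the `detail` string and computing progress can be done mechanically. The field format is from the tick line above; the parsing code itself is an illustrative sketch, not part of the daemon.

```rust
/// Extract (mined, stale) from a tick `detail` string such as
/// "277 stale, 219 mined, 4 extract, 0 fact, 0 open".
fn parse_detail(detail: &str) -> Option<(u64, u64)> {
    let mut stale = None;
    let mut mined = None;
    for part in detail.split(", ") {
        let mut it = part.split_whitespace();
        let n: u64 = it.next()?.parse().ok()?;
        match it.next()? {
            "stale" => stale = Some(n),
            "mined" => mined = Some(n),
            _ => {} // extract / fact / open: not needed for progress
        }
    }
    Some((mined?, stale?))
}

fn main() {
    let detail = "277 stale, 219 mined, 4 extract, 0 fact, 0 open";
    let (mined, stale) = parse_detail(detail).unwrap();
    assert_eq!((mined, stale), (219, 277));
    // prints "progress: 79%"
    println!("progress: {:.0}%", 100.0 * mined as f64 / stale as f64);
}
```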
### Checking pipeline health
```bash
# Experience-mine completions (logged as "experience-mine", not "extract")
grep "experience-mine.*completed" ~/.consciousness/memory/daemon.log | wc -l
# Errors
grep "experience-mine.*failed" ~/.consciousness/memory/daemon.log | wc -l
# Store size and node count
poc-memory status
wc -c ~/.consciousness/memory/nodes.capnp
```
## Common issues
**stale count never decreases**: Normal. It's a raw file count, not a
backlog counter. Compare `mined` to `stale` for actual progress.
**Early failures ("claude exited exit status: 1")**: Oversized segments
hitting the LLM context limit. The 150k-token size guard and segmented
mining should prevent this. If it recurs, check segment sizes.
**Memory pressure (OOM)**: Each job loads the full capnp store. At
200MB+ store size, concurrent jobs can spike to ~5GB. The resource pool
serializes access, but if the pool size is increased, watch RSS.
**Segments not progressing**: The watcher memoizes segment counts in
`seg_cache`. If a file is modified after caching (e.g., session resumed),
the daemon won't see new segments until restarted.
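The caching behavior behind this issue can be modeled in a few lines. This is a simplified stand-in for the real `seg_cache` (names and types are assumptions): because the cache is keyed only by path, a recomputed segment count never replaces a cached one.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Simplified model of the watcher's segment-count memoization.
struct SegCache {
    counts: HashMap<PathBuf, usize>,
}

impl SegCache {
    /// Return the cached count, computing it only on first sight of `path`.
    fn get_or_compute(&mut self, path: PathBuf, compute: impl Fn() -> usize) -> usize {
        *self.counts.entry(path).or_insert_with(compute)
    }
}

fn main() {
    let mut cache = SegCache { counts: HashMap::new() };
    let p = PathBuf::from("session.jsonl");
    // First tick: the file has 2 segments.
    assert_eq!(cache.get_or_compute(p.clone(), || 2), 2);
    // Session resumed, file grows to 3 segments -- but the cached value
    // wins until the daemon restarts and rebuilds the cache.
    assert_eq!(cache.get_or_compute(p, || 3), 2);
}
```

Keying the cache by `(path, mtime)` or `(path, size)` would make it self-invalidating; whether that trade-off is worth the extra `stat` calls is a design question for the real watcher, not something this sketch settles.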
**Extract jobs queued but 0 completed in log**: Completion events are
logged under the `experience-mine` job name, not `extract`. The `extract`
label is only used for queue events.