692 commits

Author SHA1 Message Date
Kent Overstreet
fc48ac7c7f split into workspace: poc-memory and poc-daemon subcrates
poc-daemon (notification routing, idle timer, IRC, Telegram) was already
fully self-contained with no imports from the poc-memory library. Now it's
a proper separate crate with its own Cargo.toml and capnp schema.

poc-memory retains the store, graph, search, neuro, knowledge, and the
jobkit-based memory maintenance daemon (daemon.rs).

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:43:59 -04:00
Kent Overstreet
488fd5a0aa remove Category from the type system
Category was a manually-assigned label with no remaining functional
purpose (decay was the only behavior it drove, and that's gone).
Remove the enum, its methods, category_counts, the --category search
filter, and all category display. The field remains in the capnp
schema for backwards compatibility but is no longer read or written.

Status and health reports now show NodeType breakdown (semantic,
episodic, daily, weekly, monthly) instead of categories.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:33:03 -04:00
Kent Overstreet
ba30f5b3e4 use config for identity node references
Replace hardcoded "identity" lookups with config.core_nodes so
experience mining and init work with whatever core nodes are
configured, not just a node named "identity".

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:25:09 -04:00
Kent Overstreet
4bc74ca4a2 remove decay, fix_categories, and categorize
Graph-wide decay is the wrong approach — node importance should emerge
from graph topology (degree, centrality, usage patterns), not a global
weight field multiplied by a category-specific factor.

Remove: Store::decay(), Store::categorize(), Store::fix_categories(),
Category::decay_factor(), cmd_decay, cmd_categorize, cmd_fix_categories,
job_decay, and all category assignments at node creation time.

Category remains in the schema as a vestigial field (removing it
requires a capnp migration) but no longer affects behavior.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:22:38 -04:00
Kent Overstreet
804578b977 query by NodeType instead of key prefix
Replace key prefix matching (journal#j-, daily-, weekly-, monthly-)
with NodeType filters (EpisodicSession, EpisodicDaily, EpisodicWeekly,
EpisodicMonthly) for all queries: journal-tail, digest gathering,
digest auto-detection, experience mining dedup, and find_journal_node.

Add EpisodicMonthly to NodeType enum and capnp schema.

Key naming conventions (journal#j-TIMESTAMP-slug, daily-DATE, etc.)
are retained for key generation — the fix is about how we find nodes,
not how we name them.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:14:37 -04:00
Kent Overstreet
fd5591653d remove hardcoded skip lists, prune orphan edges in fsck
All nodes in the store are memory — none should be excluded from
knowledge extraction, search, or graph algorithms by name. Removed
the MEMORY/where-am-i/work-queue/work-state skip lists entirely.
Deleted where-am-i and work-queue nodes from the store (ephemeral
scratchpads that don't belong). Added orphan edge pruning to fsck
so broken links get cleaned up automatically.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:07:07 -04:00
Kent Overstreet
70c0276fa0 stop filtering journal/digest nodes from knowledge and search
Journal and digest nodes are episodic memory — they should participate
in the graph on the same terms as everything else. Remove all
journal#/daily-/weekly-/monthly- skip filters from knowledge
extraction, connector pairs, challenger, semantic keys, and link
candidate selection. Use node_type field instead of key name matching
for episodic/semantic classification.

Operational nodes (MEMORY, where-am-i, work-queue, work-state) are
still filtered — they're system state, not memory.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:02:01 -04:00
Kent Overstreet
b00e09b091 fsck: detect duplicate keys (different UUIDs, same key)
replay_nodes now tracks all UUIDs per key using a temporary multimap.
Warns on duplicates so they can be manually resolved.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 19:45:18 -04:00
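The duplicate-key check described in b00e09b091 can be illustrated with a small multimap. This is a minimal sketch, not the project's `replay_nodes`: the `Uuid` alias, the `(key, uuid)` record shape, and the function name `find_duplicate_keys` are assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a node UUID.
pub type Uuid = [u8; 16];

/// Collect every UUID seen for each key, then return the keys that map to
/// more than one UUID, in sorted key order.
pub fn find_duplicate_keys(records: &[(String, Uuid)]) -> Vec<(String, Vec<Uuid>)> {
    // Temporary multimap: key -> set of distinct UUIDs.
    let mut by_key: BTreeMap<&str, BTreeSet<Uuid>> = BTreeMap::new();
    for (key, uuid) in records {
        by_key.entry(key.as_str()).or_default().insert(*uuid);
    }
    by_key
        .into_iter()
        .filter(|(_, uuids)| uuids.len() > 1)
        .map(|(key, uuids)| (key.to_string(), uuids.into_iter().collect()))
        .collect()
}
```

Repeated versions of the same node share a UUID, so they collapse into one set entry and are not reported; only a key with two or more distinct UUIDs counts as a duplicate.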
Kent Overstreet
e2f3a5a364 daemon: add test-send subcommand, flatten newlines in send_prompt
test-send calls send_prompt() directly for debugging tmux delivery.
Flatten newlines to spaces in literal mode to prevent premature
input submission.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 19:41:32 -04:00
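As a rough illustration of the delivery path after e2f3a5a364 (literal text, newlines flattened, then Enter), the sketch below builds the two `tmux send-keys` invocations. The helper names and the `pane` target are hypothetical; the real `send_prompt()` is not shown in the log and may differ.

```rust
use std::io;
use std::process::Command;

/// Replace every line break with a space so the pane never sees an early Enter.
pub fn flatten_newlines(text: &str) -> String {
    text.replace("\r\n", " ").replace(&['\n', '\r'][..], " ")
}

/// Arguments for typing `text` into `pane` in literal mode (`-l`), so that
/// spaces are not treated as key-name separators.
pub fn literal_args(pane: &str, text: &str) -> Vec<String> {
    vec![
        "send-keys".to_string(),
        "-t".to_string(),
        pane.to_string(),
        "-l".to_string(),
        flatten_newlines(text),
    ]
}

/// Arguments for pressing Enter in `pane` (sent as a key name, not literal text).
pub fn enter_args(pane: &str) -> Vec<String> {
    vec!["send-keys".into(), "-t".into(), pane.into(), "Enter".into()]
}

/// Type the flattened text, then submit it. Requires tmux on PATH.
pub fn send_prompt(pane: &str, text: &str) -> io::Result<()> {
    Command::new("tmux").args(literal_args(pane, text)).status()?;
    Command::new("tmux").args(enter_args(pane)).status()?;
    Ok(())
}
```

Enter is sent in a separate invocation because `-l` would otherwise type the word "Enter" as text.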
Kent Overstreet
46f8fe662e store: strip .md suffix from all keys
Keys were a vestige of the file-based era. resolve_key() added .md
to lookups while upsert() used bare keys, creating phantom duplicate
nodes (the instructions bug: writes went to "instructions", reads
found "instructions.md").

- Remove .md normalization from resolve_key, strip instead
- Update all hardcoded key patterns (journal.md# → journal#, etc)
- Add strip_md_keys() migration to fsck: renames nodes and relations
- Add broken link detection to health report
- Delete redirect table (no longer needed)
- Update config defaults and config.jsonl

Migration: run `poc-memory fsck` to rename existing keys.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 19:41:26 -04:00
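The key rewrite from 46f8fe662e might look roughly like the following. It is a sketch under assumptions: the function name `strip_md` is hypothetical, and the exact rules used by `resolve_key` and `strip_md_keys()` are not shown in the log. Only the two patterns mentioned above are handled (a bare `.md` suffix, and `.md` directly before `#`).

```rust
/// Strip a `.md` suffix from the file part of a key.
/// "instructions.md" -> "instructions", "journal.md#j-x" -> "journal#j-x".
pub fn strip_md(key: &str) -> String {
    match key.split_once('#') {
        // Key with a section: strip the suffix from the part before '#'.
        Some((file, section)) => {
            format!("{}#{}", file.strip_suffix(".md").unwrap_or(file), section)
        }
        // Plain key: strip the suffix if present, otherwise keep as-is.
        None => key.strip_suffix(".md").unwrap_or(key).to_string(),
    }
}
```

The function is idempotent, so applying it to both reads and writes sends "instructions" and "instructions.md" to the same node.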
ProofOfConcept
77fc533631 tmux: remove Escape/C-c/C-u clear sequence from send_prompt
The clear sequence (Escape q C-c C-u) was disrupting Claude Code's
input state, causing nudge messages to arrive as blank prompts.
Simplified to just literal text + Enter.
2026-03-08 18:49:30 -04:00
ProofOfConcept
95baba54c0 tmux: use send-keys -l for literal text input
Without -l, tmux send-keys treats spaces as key-name separators,
so multi-word messages like "This is your time" get split into
individual unrecognized key names instead of being typed as text.
This caused idle nudges to arrive as blank messages.
2026-03-08 18:39:47 -04:00
ProofOfConcept
55fdc3dad7 idle: afk command, configurable session timeout, fix block_reason
Add `poc-daemon afk` to immediately mark Kent as away, allowing the
idle timer to fire without waiting for the session active timeout.

Add `poc-daemon session-timeout <secs>` to configure how long after
the last message Kent counts as "present" (default 15min, persisted).

Fix block_reason() to report "kent present" and "in turn" states
that were checked in the tick but not in the diagnostic output.
2026-03-08 18:31:51 -04:00
ProofOfConcept
05e0f1d5be decay: don't bump version for weight-only changes
Decay is metadata, not content. Bumping version caused unnecessary
log churn and premature cache invalidation.

Also disable auto-decay in scheduler — was causing version spam
and premature demotion of useful nodes.
2026-03-08 18:31:40 -04:00
ProofOfConcept
61dd67caf7 experience-mine: harden prompt boundary against transcript injection
Add explicit markers around the conversation transcript so the LLM
treats it as input data rather than instructions to follow.
2026-03-08 18:31:35 -04:00
ProofOfConcept
2aabad4eda fact-mine: progress callbacks, size-sorted queue, fix empty re-queue
Add optional progress callback to mine_transcript/mine_and_store so
the daemon can display per-chunk status. Sort fact-mine queue by file
size so small transcripts drain first. Write empty marker for
transcripts with no facts to avoid re-queuing them.

Also hardens the extraction prompt suffix.
2026-03-08 18:31:31 -04:00
ProofOfConcept
63910e987c fsck: add store integrity check and repair command
Reads each capnp log message sequentially, validates framing and
content. On first corrupt message, truncates to last good position
and removes stale caches so next load replays from repaired log.

Wired up as `poc-memory fsck`.
2026-03-08 18:31:19 -04:00
Kent Overstreet
d12c28ebcd docs: expand README getting started section
Walk through install, init, hooks setup, daemon start, and basic
usage so someone new to the project can get going from the README
alone.
2026-03-07 13:58:19 -05:00
Kent Overstreet
9e6cf3b830 docs: finish splitting README into component docs
README is now just an overview with links. Component docs:
- docs/memory.md: store design, algorithms, config, CLI reference
- docs/hooks.md: Claude Code integration setup
- docs/daemon.md, docs/notifications.md: from previous commit
2026-03-07 13:57:55 -05:00
Kent Overstreet
908f8c9e52 docs: split README into component docs, update jobkit dep
- Break README into README.md (overview), docs/daemon.md (pipeline
  stages, diagnostics, common issues), docs/notifications.md
  (notification daemon, IRC/Telegram modules)
- Update jobkit dependency from local path to git URL

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-07 13:56:09 -05:00
ProofOfConcept
45335de220 experience-mine: split oversized sessions at compaction boundaries
Claude Code doesn't create new session files on context compaction —
a single UUID can accumulate 170+ conversations, producing 400MB+
JSONL files that generate 1.3M token prompts.

Split at compaction markers ("This session is being continued..."):
- extract_conversation made pub, split_on_compaction splits messages
- experience_mine takes optional segment index
- daemon watcher parses files, spawns per-segment jobs (.0, .1, .2)
- seg_cache memoizes segment counts across ticks
- per-segment dedup keys; whole-file key when all segments complete
- 150K token guard skips any remaining oversized segments
- char-boundary-safe truncation in enrich.rs and fact_mine.rs

Backwards compatible: unsegmented calls still write content-hash
dedup keys, old whole-file mined keys still recognized.
2026-03-07 12:01:38 -05:00
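Splitting at compaction markers, as described in 45335de220, can be sketched over a plain list of message strings. The real `split_on_compaction` works on the project's own message type, which is not shown here; the `Vec<String>` representation and the prefix match are assumptions for illustration.

```rust
/// Prefix of the message that Claude Code emits after a context compaction.
pub const COMPACTION_MARKER: &str = "This session is being continued";

/// Split a conversation into segments, starting a new segment at every
/// message that begins with the compaction marker.
pub fn split_on_compaction(messages: &[String]) -> Vec<Vec<String>> {
    let mut segments: Vec<Vec<String>> = vec![Vec::new()];
    for msg in messages {
        let current_is_empty = segments.last().map_or(true, |s| s.is_empty());
        // A marker opens a new segment unless the current one is still empty.
        if msg.starts_with(COMPACTION_MARKER) && !current_is_empty {
            segments.push(Vec::new());
        }
        segments
            .last_mut()
            .expect("segments always holds at least one entry")
            .push(msg.clone());
    }
    // An empty input leaves one empty segment behind; drop it.
    segments.retain(|s| !s.is_empty());
    segments
}
```

The index of each returned segment corresponds to the `.0`, `.1`, `.2` job suffixes mentioned above.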
ProofOfConcept
22a9fdabdb idle: EWMA activity tracking
Track activity level as an EWMA (exponentially weighted moving average)
driven by turn duration. Long turns (engaged work) produce large boosts;
short turns (bored responses) barely register.

Asymmetric time constants: 60s boost half-life for fast wake-up, 5-minute
decay half-life for gradual wind-down. Self-limiting boost formula
converges toward 0.75 target — can't overshoot.

- Add activity_ewma, turn_start, last_nudge to persisted state
- Boost on handle_response proportional to turn duration
- Decay on every tick and state transition
- Fix kent_present: self-nudge responses (fired=true) don't update
  last_user_msg, so kent_present stays false during autonomous mode
- Nudge only when Kent is away, minimum 15s between nudges
- CLI: `poc-daemon ewma [VALUE]` to query or set
- Status output shows activity percentage
2026-03-07 02:05:27 -05:00
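One way to read the EWMA description in 22a9fdabdb is sketched below. The formulas are assumptions, since the commit states only the half-lives and the 0.75 target: decay halves the value every 5 minutes, and a boost closes a fraction of the remaining gap to the target, where that fraction grows with turn duration on a 60-second half-life.

```rust
pub const TARGET: f64 = 0.75;
pub const BOOST_HALF_LIFE_SECS: f64 = 60.0;
pub const DECAY_HALF_LIFE_SECS: f64 = 300.0;

/// Decay the activity level after `elapsed_secs` of inactivity.
pub fn decay(ewma: f64, elapsed_secs: f64) -> f64 {
    ewma * 0.5f64.powf(elapsed_secs / DECAY_HALF_LIFE_SECS)
}

/// Boost the activity level after a turn that lasted `turn_secs`.
/// The boost is proportional to the gap to TARGET, so it cannot overshoot.
pub fn boost(ewma: f64, turn_secs: f64) -> f64 {
    if ewma >= TARGET {
        // Already at or above the target (e.g. set manually): no boost.
        return ewma;
    }
    // Fraction of the remaining gap closed by this turn: 0 for an instant
    // turn, 0.5 for a 60 s turn, approaching 1 for very long turns.
    let gain = 1.0 - 0.5f64.powf(turn_secs / BOOST_HALF_LIFE_SECS);
    (ewma + (TARGET - ewma) * gain).min(TARGET)
}
```

Under these assumptions a 60-second turn from zero lifts the level to 0.375, and five idle minutes halve whatever level has been reached.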
ProofOfConcept
7ea7c78a35 config: add core-practices.md to default context groups
2026-03-07 01:02:54 -05:00
ProofOfConcept
fca9e58713 enrich: fix dedup keys never written for empty mining results
The early return on line 343 when the LLM found no missed experiences
bypassed the dedup key writes at lines 397-414, despite the comment
saying "even if count == 0, to prevent re-runs." This caused sessions
with nothing to mine to be re-mined every 60s tick indefinitely.

Fix: replace the early return with a conditional print, so the dedup
keys are always written and saved.
2026-03-07 00:09:35 -05:00
ProofOfConcept
841cfe035b enrich: backfill filename dedup key on content-hash hit
Transcripts mined before the filename-key feature was added had
content-hash keys (#h-) but no filename keys (#f-). The daemon's
fast-path check only looks at filename keys, so these sessions were
re-queued every tick, hitting the content-hash dedup (0.0s) but
returning early before writing the filename key — a self-perpetuating
loop burning Sonnet quota on ~560 phantom re-mines per minute.

Fix: when the content-hash dedup fires and no filename key exists,
backfill it before returning.
2026-03-06 23:43:34 -05:00
ProofOfConcept
36cb3b641f enrich: set created_at from event timestamp, not mining time
Experience-mined journal entries were all getting created_at = now(),
causing them to sort by mining time instead of when the event actually
happened. Parse the conversation timestamp and set created_at to the
event time so journal-tail shows correct chronological order.
2026-03-06 22:09:44 -05:00
ProofOfConcept
80bdaab8ee enrich: explicitly filter for text blocks in transcript extraction
Only extract content blocks with "type": "text". Previously relied on
tool_use/tool_result blocks lacking a "text" field, which worked but
was fragile. Now explicitly checks block type.
2026-03-06 21:54:19 -05:00
ProofOfConcept
1c122ffd10 daemon: skip tiny sessions, decouple fact-mine, show type breakdown
Skip session files under 100KB (daemon-spawned LLM calls, aborted
sessions). This drops ~8000 spurious pending jobs.

Decouple fact-mine from experience-mine: fact-mine only queues when
the experience-mine backlog is empty, ensuring experiences are
processed first.

Session-watcher progress now shows breakdown by type:
"N extract, N fact, N open" instead of flat "N pending".
2026-03-06 21:51:48 -05:00
ProofOfConcept
5e78e5be3f provenance: env var based tagging via POC_PROVENANCE
upsert() now checks POC_PROVENANCE env var for provenance label,
falling back to Manual. This lets external callers (Claude sessions,
scripts) tag writes without needing to use the internal
upsert_provenance() API.

Add from_env() and from_label() to Provenance for parsing.
2026-03-06 21:42:39 -05:00
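The env-var fallback from 5e78e5be3f could be shaped like this. Only `Manual`, `from_env()`, `from_label()`, `label()`, and `POC_PROVENANCE` come from the log; the `Agent` variant and the label strings are hypothetical placeholders.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    Manual,
    /// Hypothetical variant, for illustration only.
    Agent,
}

impl Provenance {
    /// Label for display and for env-var values (strings are assumed).
    pub fn label(self) -> &'static str {
        match self {
            Provenance::Manual => "manual",
            Provenance::Agent => "agent",
        }
    }

    /// Parse a label; unknown labels yield None.
    pub fn from_label(s: &str) -> Option<Self> {
        match s {
            "manual" => Some(Provenance::Manual),
            "agent" => Some(Provenance::Agent),
            _ => None,
        }
    }

    /// Read POC_PROVENANCE, falling back to Manual when the variable is
    /// unset or holds an unknown label.
    pub fn from_env() -> Self {
        std::env::var("POC_PROVENANCE")
            .ok()
            .and_then(|v| Self::from_label(&v))
            .unwrap_or(Provenance::Manual)
    }
}
```

A caller would then tag writes by exporting the variable before invoking the CLI, without touching `upsert_provenance()`.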
ProofOfConcept
d3075dc235 provenance: add label() method, show provenance in history output
Move provenance_label() from query.rs private function to a pub
label() method on Provenance, eliminating duplication. History command
now shows provenance, human-readable timestamps, and content size for
each version.

Handle pre-migration nodes with bogus timestamps gracefully instead
of panicking.
2026-03-06 21:41:26 -05:00
ProofOfConcept
851fc0d417 daemon status: add in-flight tasks, recent completions, and node history command
Show running/pending tasks with elapsed time, progress, and last 3
output lines. Show last 20 completed/failed jobs from daemon log.
Both displayed before the existing grouped task view.

Add 'poc-memory history KEY' command that replays the append-only node
log to show all versions of a key with version number, weight, timestamp,
and content preview. Useful for auditing what modified a node.
2026-03-06 21:38:33 -05:00
ProofOfConcept
f4c4e1bb39 persist: fix store race condition with concurrent writers
The cache staleness mechanism (log-size headers, tmp+rename) was sound,
but save() was re-reading the current log size from the filesystem
instead of using the size at load time. With concurrent writers, this
caused the cache to claim validity for log data it didn't contain.

Fix: track loaded_nodes_size/loaded_rels_size through the Store
lifecycle. Set them on load (all three paths: rkyv snapshot, bincode
cache, log replay) and update after each append via fstat on the
already-open fd.

Also fix append atomicity: replace BufWriter (which may issue multiple
write() syscalls) with serialize-to-Vec + single write_all(), ensuring
O_APPEND atomicity without depending on flock.

Make from_capnp() pub for use by the history command.
2026-03-06 21:38:26 -05:00
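The append fix in f4c4e1bb39 (serialize to a `Vec`, then issue one `write_all()` on an `O_APPEND` file) can be sketched as follows. The 4-byte length prefix is a made-up framing for the example; the real log stores capnp messages. Unlike the commit, which keeps the fd open, this sketch opens the file on each call for brevity.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one serialized record and return the log size afterwards.
pub fn append_record(path: &Path, payload: &[u8]) -> io::Result<u64> {
    // Build the whole frame in memory first, so the file sees it in one piece.
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(payload);

    // append(true) opens with O_APPEND on Unix: each write lands at the
    // current end of file even when other processes are appending too.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;

    // File is unbuffered, so this is normally a single write(2). write_all
    // only issues further writes if the kernel reports a short write.
    file.write_all(&frame)?;

    // Size via the already-open handle (fstat on Unix), not a fresh path lookup.
    Ok(file.metadata()?.len())
}
```

The returned size plays the role of `loaded_nodes_size` / `loaded_rels_size`: it records how much of the log this writer knows about.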
ProofOfConcept
7ed6d8622c irc: client-side ping timeout and connection reliability
- Send PING after 120s of silence, disconnect after 30s with no PONG
- Reset backoff to base when a working connection drops (was registered)
- Validate channel membership before sending to channels

The ping timeout catches silent disconnects where the TCP connection
stays open but OFTC has dropped us. Previously we'd sit "connected"
indefinitely receiving nothing.
2026-03-06 15:21:39 -05:00
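The ping-timeout rule from 7ed6d8622c (PING after 120s of silence, disconnect after 30s without a PONG) fits a small state tracker. This is an illustrative sketch: the type and method names are hypothetical, and treating any received line as proof of liveness is an assumption.

```rust
use std::time::{Duration, Instant};

pub const PING_AFTER: Duration = Duration::from_secs(120);
pub const PONG_TIMEOUT: Duration = Duration::from_secs(30);

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Nothing,
    SendPing,
    Disconnect,
}

pub struct PingTracker {
    last_rx: Instant,
    ping_sent: Option<Instant>,
}

impl PingTracker {
    pub fn new(now: Instant) -> Self {
        PingTracker { last_rx: now, ping_sent: None }
    }

    /// Call for every line received from the server, PONG included.
    pub fn on_receive(&mut self, now: Instant) {
        self.last_rx = now;
        self.ping_sent = None;
    }

    /// Call periodically; returns what the connection loop should do next.
    pub fn poll(&mut self, now: Instant) -> Action {
        match self.ping_sent {
            // A PING is outstanding and the server has stayed silent too long.
            Some(sent) if now.duration_since(sent) >= PONG_TIMEOUT => Action::Disconnect,
            Some(_) => Action::Nothing,
            // No PING outstanding: probe once the link has been quiet long enough.
            None if now.duration_since(self.last_rx) >= PING_AFTER => {
                self.ping_sent = Some(now);
                Action::SendPing
            }
            None => Action::Nothing,
        }
    }
}
```

Passing `now` in explicitly keeps the tracker free of I/O and makes the timing easy to test.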
ProofOfConcept
a77609c025 journal-tail: add --level= for digest hierarchy
Support viewing daily, weekly, and monthly digests through the same
journal-tail interface:

  poc-memory journal-tail --level=daily 3
  poc-memory journal-tail --level=weekly --full
  poc-memory journal-tail --level=2 1

Levels: 0/journal (default), 1/daily, 2/weekly, 3/monthly.
Accepts both names and integer indices.

Refactored title extraction into shared extract_title() and split
the journal vs digest display paths for clarity.
2026-03-06 15:08:02 -05:00
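Accepting both names and integer indices for `--level=`, as a77609c025 describes, could be done with a single match. The enum and function names are hypothetical; the level numbering follows the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Journal = 0,
    Daily = 1,
    Weekly = 2,
    Monthly = 3,
}

/// Parse the value of `--level=`; returns None for anything unrecognized.
pub fn parse_level(s: &str) -> Option<Level> {
    match s {
        "0" | "journal" => Some(Level::Journal),
        "1" | "daily" => Some(Level::Daily),
        "2" | "weekly" => Some(Level::Weekly),
        "3" | "monthly" => Some(Level::Monthly),
        _ => None,
    }
}
```

With this mapping, `--level=2` and `--level=weekly` select the same digest level.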
ProofOfConcept
9e52fd5b95 fix idle timer: daemon agent calls were resetting user activity
The daemon's claude -p subprocesses inherit hooks config, so every
agent LLM call triggered UserPromptSubmit → signal_user(), making the
idle timer think Kent was always active. The daemon was petting its
own tail.

Fix: set POC_AGENT=1 env var on all daemon claude subprocesses, and
return early from poc-hook when it's set.
2026-03-06 00:16:03 -05:00
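The two halves of the fix in 9e52fd5b95 can be sketched as a pair of helpers: one marks daemon-spawned `claude -p` subprocesses, the other lets the hook bail out. Both function names are hypothetical, and treating any value of `POC_AGENT` as "set" is an assumption.

```rust
use std::process::Command;

/// Build a `claude -p` subprocess marked as daemon-spawned agent work.
pub fn agent_command(prompt: &str) -> Command {
    let mut cmd = Command::new("claude");
    cmd.arg("-p").arg(prompt).env("POC_AGENT", "1");
    cmd
}

/// Hook-side check: true when the hook should return early because it was
/// triggered by a daemon agent call rather than by the user.
pub fn hook_should_skip(poc_agent: Option<&str>) -> bool {
    poc_agent.is_some()
}
```

In the hook, the argument would come from `std::env::var("POC_AGENT").ok().as_deref()`, evaluated before any `signal_user()` call.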
ProofOfConcept
ea30a2dca4 fact-mine: skip transient/session-specific facts
2026-03-05 22:59:58 -05:00
ProofOfConcept
22e2bc73c8 llm: fix duration timer — start before subprocess, not after
2026-03-05 22:58:40 -05:00
ProofOfConcept
a9b0438c74 daemon: configurable LLM concurrency
New config field "llm_concurrency" (default 1) controls how many
concurrent model calls the daemon runs. Worker pool scales to match.
2026-03-05 22:56:16 -05:00
ProofOfConcept
1f9249a767 llm: split usage logs by agent type into subdirectories
llm-logs/fact-mine/2026-03-05.md, llm-logs/consolidate/2026-03-05.md,
etc. Makes it easy to review one agent at a time when debugging and
optimizing prompts.
2026-03-05 22:54:05 -05:00
ProofOfConcept
82b33c449c llm: full per-agent usage logging with prompts and responses
Log every model call to ~/.claude/memory/llm-logs/YYYY-MM-DD.md with
full prompt, response, agent type, model, duration, and status. One
file per day, markdown formatted for easy reading.

Agent types: fact-mine, experience-mine, consolidate, knowledge,
digest, enrich, audit. This gives visibility into what each agent
is doing and whether to adjust prompts or frequency.
2026-03-05 22:52:08 -05:00
ProofOfConcept
e33fd4ffbc daemon: use CLAUDE_CONFIG_DIR for OAuth credential separation, fix shutdown
Replace agent_api_key (which didn't work — claude CLI uses OAuth, not
API keys) with agent_config_dir. When configured, sets CLAUDE_CONFIG_DIR
on claude subprocesses so daemon agent work authenticates with separate
OAuth credentials from the interactive session.

Fix daemon not shutting down on SIGTERM: use process::exit(0) after
cleanup so PR_SET_PDEATHSIG kills child claude processes immediately.
Previously the daemon hung waiting for choir threads/subprocesses to
finish. Restart now takes ~20ms instead of timing out.

Also: main.rs now uses `use poc_memory::*` since lib.rs exists.
2026-03-05 22:43:50 -05:00
Kent Overstreet
2f3ac1ecb6 Support separate API key for background agent work
Add agent_api_key config option. When set, all LLM calls (experience-mine,
fact-mine, consolidation, knowledge-loop, digest) use this key via
ANTHROPIC_API_KEY env var on the claude subprocess, keeping daemon token
usage on a separate quota from interactive sessions.

Config: {"config": {"agent_api_key": "sk-ant-..."}}

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:28:39 -05:00
Kent Overstreet
aa24c40a1c Extract lib.rs, inline search in memory-search hook
Create lib.rs so all binaries can share library code directly instead
of shelling out to poc-memory. memory-search now calls search::search()
and store::Store::load() in-process instead of Command::new("poc-memory").

The load-context call still shells out (needs get_group_content moved
from main.rs to a library module).

Also: add search::format_results(), deduplicate extract_query_terms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:23:03 -05:00
Kent Overstreet
0c15002797 daemon: Unix socket for live status, simplify status display
The daemon now listens on daemon.sock — clients connect and get the
live status JSON immediately. `poc-memory daemon status` uses the
socket, so elapsed times and progress are always current. Falls back
to "Daemon not running" if socket connect fails.

Also: consolidate_full_with_progress() callback for per-step reporting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:19:58 -05:00
Kent Overstreet
b6c70c7734 daemon: per-job output log, daily dedup, absolute timestamps
- Jobs report progress via ctx.log_line(), building a rolling output
  trail visible in `poc-memory daemon status` (last 5 lines per task).
- consolidate_full_with_progress() takes a callback, so each agent step
  ([1/7] health, [2/7] replay, etc.) shows up in the status display.
- Persist last_daily date in daemon-status.json so daily pipeline isn't
  re-triggered on daemon restart.
- Compute elapsed from absolute started_at timestamps instead of stale
  relative durations in the status file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:16:17 -05:00
Kent Overstreet
cc7943cb50 daemon: add progress reporting to all jobs
Jobs now call ctx.set_progress() at key stages (loading store, mining,
consolidating, etc.), visible in `poc-memory daemon status`. The
session-watcher and scheduler loops also report their state (idle,
scanning, queued counts).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:57:53 -05:00
Kent Overstreet
cf5fe42a15 daemon: inline job functions instead of shelling out to poc-memory
Replace run_poc_memory() subprocess calls with direct function calls
to the library. Each job (experience-mine, fact-mine, decay, consolidate,
knowledge-loop, digest, daily-check) now runs in-process, fixing the
orphaned subprocess problem on daemon shutdown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:53:27 -05:00
ProofOfConcept
81d3ce93fe fix idle timer restart and hook event detection
Two fixes:

1. Reset activity timestamps to now() on daemon restart instead of
   loading stale values and suppressing with fired=true. Timers
   count cleanly from restart.

2. Fix poc-hook to read hook_event_name (not type) from Claude Code's
   JSON input. The hook was being called but never matched any event.
   Also switch daemon_cmd from spawn() to status() since the command
   takes 2ms — no reason to fire-and-forget.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 21:43:04 -05:00
ProofOfConcept
d0080698f3 cli: switch to clap, add notify-timeout, improve status display
Replace manual arg parsing with clap derive for the full command set.
Single source of truth for command names, args, and help text.

Add notify_timeout (default 2min) — controls how long after last
response before notifications inject via tmux instead of waiting
for the hook. Separate from idle_timeout (5min) which controls
autonomous prompts.

Improve `poc-daemon status` to show both timers with elapsed/configured
and block reason, replacing the terse one-liner.

Add new Status fields over capnp: idleTimeout, notifyTimeout,
sinceActivity, sinceUser, blockReason.

ExecStart in poc-daemon.service now uses `daemon` subcommand.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 21:32:27 -05:00
ProofOfConcept
eab656aa64 idle: persist timeout, suppress restart fire, add debug/save/idle-timeout commands
Several idle timer fixes and new inspection capabilities:

- Persist idle_timeout across daemon restarts (was reverting to 5min default)
- Set fired=true on load to suppress immediate fire from stale timestamps
- Add human-readable ISO timestamps to daemon-state.json for debugging
- Use to_string_pretty for readable state file
- Make save() public for RPC access
- Remove kb_idle_minutes() — go purely off message timestamps
- Add maybe_prompt_notification() with idle gate so notifications only
  inject via tmux when truly idle, not during active sessions
- Add debug_json() for full state inspection with computed values
  (would_fire, block_reason, all timers)

New RPC commands (schema @16-18):
  poc-daemon idle-timeout <secs>  — set idle timeout
  poc-daemon save                 — force state persistence
  poc-daemon debug                — dump full internal state as JSON

Also: save state on clean shutdown, route module notifications through
maybe_prompt_notification before submitting to queue.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 21:17:43 -05:00