Spawn memory-search --hook as a subprocess, piping the hook input
JSON through stdin and printing its stdout. This ensures memory
context injection goes through the same hook whose output Claude
Code reliably persists, fixing the issue where memory-search as a
separate hook had its output silently dropped.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Move JsonlBackwardIter and find_last_compaction() from
parse-claude-conversation into a shared transcript module. Both
memory-search and parse-claude-conversation now use the same robust
compaction detection: mmap-based backward scan, JSON parsing to
verify user-type message, content prefix check.
Replaces memory-search's old detect_compaction() which did a forward
scan with raw string matching on "continued from a previous
conversation" — that could false-positive on the string appearing
in assistant output or tool results.
Add parse-claude-conversation as a new binary for debugging what's
in the context window post-compaction.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
chrono's timestamp_opt can return None during DST transitions.
Handle all three variants (Single, Ambiguous, None) instead of
unwrapping. For DST gaps, offset by one hour to land in valid
local time.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace monolithic consolidate job with individual agent jobs
(replay, linker, separator, transfer, health) that run sequentially
and store reports. Multi-phase daily pipeline: agent runs → apply
actions → link orphans → cap degree → digest → digest links →
knowledge loop.
Add GraphHealth struct with graph metrics (alpha, gini, clustering
coefficient, episodic ratio) computed during health checks. Display
in `poc-memory daemon status`. Use cached metrics to build
consolidation plan without expensive O(n²) interference detection.
Add RPC consolidate command to trigger consolidation via socket.
Harden session watcher: skip transcripts with zero segments, improve
migration error handling.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Node weight no longer gates signal propagation — only edge_decay
and edge_strength affect traversal. Node weight is applied at the
end for ranking. This lets low-weight nodes serve as bridges
without killing the signal passing through them.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
All seeds emit at once. At each hop, activations from all sources
sum at each node, and the combined map propagates on the next hop.
Nodes where multiple wavefronts overlap get reinforced and radiate
stronger — natural interference patterns.
Lower default min_activation threshold (×0.1) since individual
contributions are smaller in additive mode.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Initialize direction from the two most spectrally separated seeds
instead of relying on input order (which was alphabetical from
BTreeMap). Run 3 rounds of power iteration with normalization
instead of 1 for better convergence.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Three new composable search stages:
confluence — multi-source spreading activation. Unlike spread (which
takes max from any source), confluence rewards nodes reachable from
multiple seeds additively. Naturally separates unrelated seed groups
since their neighborhoods don't overlap. Params: max_hops, edge_decay,
min_sources.
geodesic — straightest path between seed pairs in spectral space.
At each graph hop, picks the neighbor whose spectral direction most
aligns with the target (cosine similarity of direction vectors).
Nodes on many geodesic paths score highest. Params: max_path, k.
manifold — extrapolation along the direction seeds define. Computes
weighted centroid + principal axis of seeds in spectral space, then
scores candidates by projection onto that axis (penalized by
perpendicular distance). Finds what's "further along" rather than
"nearby." Params: k.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace O(n²) Vec::contains + sort/dedup with O(n) HashSet for orphan
node tracking in health_report(). Use imported HashMap type instead of
fully-qualified std::collections::HashMap.
- New agents/transcript.rs: shared JSONL parsing for enrich, fact_mine,
and knowledge (was 3 separate implementations, ~150 lines duplicated)
- New best_match() and section_children() helpers in neuro/rewrite.rs
(was duplicated find-best-by-similarity loop + section collection)
- Net -153 lines
- Replace `pub use types::*` in store/mod.rs with explicit re-export list
- Make transcript_dedup_key private in agents/enrich.rs (only used internally)
- Inline duplicated projects_dir() helper in agents/knowledge.rs and daemon.rs
Replace all partial_cmp().unwrap() with total_cmp() in spectral.rs
and knowledge.rs — eliminates potential panics on NaN without
changing behavior for normal floats.
Use existing weighted_distance() and eigenvalue_weights() helpers in
nearest_neighbors() and nearest_to_seeds() instead of inlining the
same distance computation.
Move parse_timestamp_to_epoch() from enrich.rs to util.rs — was
duplicated logic, now shared.
Replace O(n²) relation existence check in init_from_markdown() with
a HashSet of (source, target) UUID pairs. With 26K relations this
was scanning linearly for every link in every markdown unit.
Move prompts_dir into Config (was hardcoded ~/poc/memory/prompts).
Replace hardcoded ~/.claude/memory paths in spectral.rs, graph.rs,
and main.rs with store::memory_dir() or config::get(). Replace
hardcoded ~/.claude/projects in knowledge.rs and main.rs with
config::get().projects_dir.
Extract apply_agent_file() from cmd_apply_agent() — separates
file scanning from per-file JSON parsing and link application.
Add util::truncate() and util::first_n_chars() to replace 16 call
sites doing the same floor_char_boundary or chars().take().collect()
patterns. Deduplicate the batching loop in consolidate.rs (4 copies
→ 1 loop over an array). Fix all clippy warnings: redundant closures,
needless borrows, collapsible if, unnecessary cast, manual strip_prefix.
Net: -44 lines across 16 files.
Replace hand-rolled argument parsing (match on args[1], manual
iteration over &[String]) with Clap's derive macros. All 60+
subcommands now have typed arguments with defaults, proper help
text, and error messages generated automatically.
The 83-line usage() function is eliminated — Clap generates help
from the struct annotations. Nested subcommands (digest daily/
weekly/monthly/auto, journal-tail --level) use Clap's subcommand
nesting naturally.
poc-daemon (notification routing, idle timer, IRC, Telegram) was already
fully self-contained with no imports from the poc-memory library. Now it's
a proper separate crate with its own Cargo.toml and capnp schema.
poc-memory retains the store, graph, search, neuro, knowledge, and the
jobkit-based memory maintenance daemon (daemon.rs).
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Category was a manually-assigned label with no remaining functional
purpose (decay was the only behavior it drove, and that's gone).
Remove the enum, its methods, category_counts, the --category search
filter, and all category display. The field remains in the capnp
schema for backwards compatibility but is no longer read or written.
Status and health reports now show NodeType breakdown (semantic,
episodic, daily, weekly, monthly) instead of categories.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace hardcoded "identity" lookups with config.core_nodes so
experience mining and init work with whatever core nodes are
configured, not just a node named "identity".
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Graph-wide decay is the wrong approach — node importance should emerge
from graph topology (degree, centrality, usage patterns), not a global
weight field multiplied by a category-specific factor.
Remove: Store::decay(), Store::categorize(), Store::fix_categories(),
Category::decay_factor(), cmd_decay, cmd_categorize, cmd_fix_categories,
job_decay, and all category assignments at node creation time.
Category remains in the schema as a vestigial field (removing it
requires a capnp migration) but no longer affects behavior.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Replace key prefix matching (journal#j-, daily-, weekly-, monthly-)
with NodeType filters (EpisodicSession, EpisodicDaily, EpisodicWeekly,
EpisodicMonthly) for all queries: journal-tail, digest gathering,
digest auto-detection, experience mining dedup, and find_journal_node.
Add EpisodicMonthly to NodeType enum and capnp schema.
Key naming conventions (journal#j-TIMESTAMP-slug, daily-DATE, etc.)
are retained for key generation — the fix is about how we find nodes,
not how we name them.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
All nodes in the store are memory — none should be excluded from
knowledge extraction, search, or graph algorithms by name. Removed
the MEMORY/where-am-i/work-queue/work-state skip lists entirely.
Deleted where-am-i and work-queue nodes from the store (ephemeral
scratchpads that don't belong). Added orphan edge pruning to fsck
so broken links get cleaned up automatically.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Journal and digest nodes are episodic memory — they should participate
in the graph on the same terms as everything else. Remove all
journal#/daily-/weekly-/monthly- skip filters from knowledge
extraction, connector pairs, challenger, semantic keys, and link
candidate selection. Use node_type field instead of key name matching
for episodic/semantic classification.
Operational nodes (MEMORY, where-am-i, work-queue, work-state) are
still filtered — they're system state, not memory.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
replay_nodes now tracks all UUIDs per key using a temporary multimap.
Warns on duplicates so they can be manually resolved.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Keys were a vestige of the file-based era. resolve_key() added .md
to lookups while upsert() used bare keys, creating phantom duplicate
nodes (the instructions bug: writes went to "instructions", reads
found "instructions.md").
- Remove .md normalization from resolve_key, strip instead
- Update all hardcoded key patterns (journal.md# → journal#, etc)
- Add strip_md_keys() migration to fsck: renames nodes and relations
- Add broken link detection to health report
- Delete redirect table (no longer needed)
- Update config defaults and config.jsonl
Migration: run `poc-memory fsck` to rename existing keys.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
The clear sequence (Escape q C-c C-u) was disrupting Claude Code's
input state, causing nudge messages to arrive as blank prompts.
Simplified to just literal text + Enter.
Without -l, tmux send-keys treats spaces as key-name separators,
so multi-word messages like "This is your time" get split into
individual unrecognized key names instead of being typed as text.
This caused idle nudges to arrive as blank messages.
Add `poc-daemon afk` to immediately mark Kent as away, allowing the
idle timer to fire without waiting for the session active timeout.
Add `poc-daemon session-timeout <secs>` to configure how long after
the last message Kent counts as "present" (default 15min, persisted).
Fix block_reason() to report "kent present" and "in turn" states
that were checked in the tick but not in the diagnostic output.
Decay is metadata, not content. Bumping version caused unnecessary
log churn and premature cache invalidation.
Also disable auto-decay in scheduler — was causing version spam
and premature demotion of useful nodes.
Add optional progress callback to mine_transcript/mine_and_store so
the daemon can display per-chunk status. Sort fact-mine queue by file
size so small transcripts drain first. Write empty marker for
transcripts with no facts to avoid re-queuing them.
Also hardens the extraction prompt suffix.
Reads each capnp log message sequentially, validates framing and
content. On first corrupt message, truncates to last good position
and removes stale caches so next load replays from repaired log.
Wired up as `poc-memory fsck`.
README is now just an overview with links. Component docs:
- docs/memory.md: store design, algorithms, config, CLI reference
- docs/hooks.md: Claude Code integration setup
- docs/daemon.md, docs/notifications.md: from previous commit
Claude Code doesn't create new session files on context compaction —
a single UUID can accumulate 170+ conversations, producing 400MB+
JSONL files that generate 1.3M token prompts.
Split at compaction markers ("This session is being continued..."):
- extract_conversation made pub, split_on_compaction splits messages
- experience_mine takes optional segment index
- daemon watcher parses files, spawns per-segment jobs (.0, .1, .2)
- seg_cache memoizes segment counts across ticks
- per-segment dedup keys; whole-file key when all segments complete
- 150K token guard skips any remaining oversized segments
- char-boundary-safe truncation in enrich.rs and fact_mine.rs
Backwards compatible: unsegmented calls still write content-hash
dedup keys, old whole-file mined keys still recognized.
Track activity level as an EWMA (exponentially weighted moving average)
driven by turn duration. Long turns (engaged work) produce large boosts;
short turns (bored responses) barely register.
Asymmetric time constants: 60s boost half-life for fast wake-up, 5-minute
decay half-life for gradual wind-down. Self-limiting boost formula
converges toward 0.75 target — can't overshoot.
- Add activity_ewma, turn_start, last_nudge to persisted state
- Boost on handle_response proportional to turn duration
- Decay on every tick and state transition
- Fix kent_present: self-nudge responses (fired=true) don't update
last_user_msg, so kent_present stays false during autonomous mode
- Nudge only when Kent is away, minimum 15s between nudges
- CLI: `poc-daemon ewma [VALUE]` to query or set
- Status output shows activity percentage
The early return on line 343 when the LLM found no missed experiences
bypassed the dedup key writes at lines 397-414, despite the comment
saying "even if count == 0, to prevent re-runs." This caused sessions
with nothing to mine to be re-mined every 60s tick indefinitely.
Fix: replace the early return with a conditional print, so the dedup
keys are always written and saved.
Transcripts mined before the filename-key feature was added had
content-hash keys (#h-) but no filename keys (#f-). The daemon's
fast-path check only looks at filename keys, so these sessions were
re-queued every tick, hitting the content-hash dedup (0.0s) but
returning early before writing the filename key — a self-perpetuating
loop burning Sonnet quota on ~560 phantom re-mines per minute.
Fix: when the content-hash dedup fires and no filename key exists,
backfill it before returning.
Experience-mined journal entries were all getting created_at = now(),
causing them to sort by mining time instead of when the event actually
happened. Parse the conversation timestamp and set created_at to the
event time so journal-tail shows correct chronological order.
Only extract content blocks with "type": "text". Previously relied on
tool_use/tool_result blocks lacking a "text" field, which worked but
was fragile. Now explicitly checks block type.
Skip session files under 100KB (daemon-spawned LLM calls, aborted
sessions). This drops ~8000 spurious pending jobs.
Decouple fact-mine from experience-mine: fact-mine only queues when
the experience-mine backlog is empty, ensuring experiences are
processed first.
Session-watcher progress now shows breakdown by type:
"N extract, N fact, N open" instead of flat "N pending".
upsert() now checks POC_PROVENANCE env var for provenance label,
falling back to Manual. This lets external callers (Claude sessions,
scripts) tag writes without needing to use the internal
upsert_provenance() API.
Add from_env() and from_label() to Provenance for parsing.
Move provenance_label() from query.rs private function to a pub
label() method on Provenance, eliminating duplication. History command
now shows provenance, human-readable timestamps, and content size for
each version.
Handle pre-migration nodes with bogus timestamps gracefully instead
of panicking.