108 commits

Author SHA1 Message Date
ProofOfConcept
57c26d8157 reorganize subcommands into logical groups
60+ flat subcommands grouped into:
- Core (daily use): search, render, write, history, tail, status, query, used, wrong, gap
- Node: delete, rename, list, edges, dump
- Journal: write, tail, enrich
- Graph: link, audit, spectral, etc.
- Agent: daemon, knowledge-loop, consolidate, digest, etc.
- Admin: init, health, fsck, import, export, etc.

Also: remove dead migration code (migrate.rs, Migrate/JournalTsMigrate commands),
update memory-search and poc-hook for new subcommand paths, update daemon systemd
template for `agent daemon` path.
2026-03-11 01:32:21 -04:00
Kent Overstreet
d76b14dfcd provenance: convert from enum to freeform string
The Provenance enum couldn't represent agents defined outside the
source code. Replace it with a Text field in the capnp schema so any
agent can write its own provenance label (e.g. "extractor:write",
"rename:tombstone") without a code change.

Schema: rename old enum fields to provenanceOld, add new Text
provenance fields. Old enum kept for reading legacy records.
Migration: from_capnp_migrate() falls back to old enum when the
new text field is empty.

Also adds `poc-memory tail` command for viewing recent store writes.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-11 01:19:52 -04:00
ProofOfConcept
de204e3075 agents: surface search hit counts to guide keep/demote decisions
Nodes actively found by search now show "Search hits: N ← actively
found by search, prefer to keep" in both the node section (seen by
extractor, linker, etc.) and rename candidate listings.

Extractor and rename prompts updated to respect this signal — merge
into high-hit nodes rather than demoting them, skip renaming nodes
that are working well in search.
2026-03-11 00:18:58 -04:00
ProofOfConcept
7a3ce4f17d counters: wire redb search hits into daemon RPC
memory-search now records which nodes it finds via the daemon's
record-hits RPC endpoint. The daemon owns the redb database
exclusively, avoiding file locking between processes.

The rename agent reads hit counts to deprioritize nodes that are
actively being found by search — renaming them would break working
queries. Daily check decays counters by 10% so stale hits fade.

Also switched RPC command reading from fixed 256-byte buffer to
read_to_string for unbounded command sizes.
2026-03-11 00:13:58 -04:00
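The daily 10% decay described in commit 7a3ce4f17d can be illustrated with a small in-memory sketch. This is not the redb-backed code from the commit: the `HashMap` store, the function name, the integer rounding, and the removal of counters that reach zero are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Reduce every counter by `decay_percent` (10 for the daily check).
/// Assumption: values round down, and counters that reach zero are
/// dropped so stale keys do not accumulate.
pub fn decay_counters(counters: &mut HashMap<String, u64>, decay_percent: u64) {
    let keep = 100u64.saturating_sub(decay_percent);
    counters.retain(|_, hits| {
        *hits = hits.saturating_mul(keep) / 100;
        *hits > 0
    });
}
```

With rounding down, a node that was found once fades out after a single decay pass, while a node with 100 hits keeps 90.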
ProofOfConcept
884939b146 counters: add redb-backed persistent counters (skeleton)
First use case: search hit tracking for rename protection. Nodes
that memory-search actively finds shouldn't be renamed.

The counters module provides increment/read/decay operations backed
by redb (pure Rust, ACID, no C deps). Next step: wire into the
poc-memory daemon via RPC so the daemon owns the DB exclusively
and memory-search sends hits via RPC.

Also reverts the JSONL search-hits approach in favor of this.
2026-03-10 23:59:39 -04:00
ProofOfConcept
9fef98b01e rename: sort candidates by least-recently visited
Instead of a hard 7-day cutoff, sort rename candidates so the
least-recently visited come first. Naturally prioritizes unseen
nodes while allowing revisits once everything's been through.
2026-03-10 23:50:32 -04:00
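A minimal sketch of the ordering described in commit 9fef98b01e, assuming a hypothetical `Candidate` record whose `last_visited` field is `None` for nodes the rename agent has never seen. The real candidate type and timestamp representation are not shown in the log.

```rust
/// Hypothetical candidate record; `last_visited` is epoch seconds,
/// or `None` if the node has never been visited.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub key: String,
    pub last_visited: Option<u64>,
}

/// Order candidates so never-visited nodes come first, followed by
/// the ones visited longest ago.
pub fn sort_least_recently_visited(candidates: &mut [Candidate]) {
    // Option<u64> orders None before Some(_), which gives unseen
    // nodes priority without a separate cutoff.
    candidates.sort_by_key(|c| c.last_visited);
}
```

Because this is a sort and not a filter, recently visited nodes stay in the list and become eligible again once everything ahead of them has been processed.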
ProofOfConcept
11cbd9664a naming: strip backticks from Haiku responses
Haiku sometimes wraps its CREATE/RENAME/MERGE_INTO lines in
backticks. Strip them before parsing so the response is recognized.
2026-03-10 23:40:38 -04:00
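The backtick stripping in commit 11cbd9664a might look roughly like the sketch below. The function name and the argument layout after each keyword are assumptions; only the three keywords come from the commit message.

```rust
/// Strip wrapping backticks from each line of a model response and
/// return the lines that begin with a recognized action keyword.
pub fn action_lines(response: &str) -> Vec<String> {
    const KEYWORDS: [&str; 3] = ["CREATE", "RENAME", "MERGE_INTO"];
    response
        .lines()
        // Trim whitespace, then any run of backticks on either side.
        .map(|line| line.trim().trim_matches('`').trim())
        .filter(|line| KEYWORDS.iter().any(|k| line.starts_with(*k)))
        .map(str::to_string)
        .collect()
}
```

A line consisting only of a fence (three backticks) trims to the empty string and is dropped, so a fully fenced response is handled the same way as inline backticks.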
ProofOfConcept
93f98a0a5d llm: add 5-minute timeout to claude subprocess
The daemon was getting stuck when a claude subprocess hung — no
completion logged, job blocked forever, pending queue growing.

Use spawn() + watchdog thread instead of blocking output(). The
watchdog sleeps in 1s increments checking a cancel flag, sends
SIGTERM at 5 minutes, SIGKILL after 5s grace. Cancel flag ensures
the watchdog exits promptly when the child finishes normally.
2026-03-10 23:29:01 -04:00
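The spawn-plus-watchdog pattern from commit 93f98a0a5d can be sketched with the standard library alone. Several things here are assumptions, not the commit's code: the function names, the configurable tick (the commit uses 1s), and sending signals through the external `kill` utility on Unix, since `std::process::Child` has no SIGTERM method. The sketch also waits for the exit status only and does not capture output.

```rust
use std::io;
use std::process::{Command, ExitStatus};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Sleep for up to `total` in `tick` steps; return true if `cancel` was set.
fn sleep_or_cancel(total: Duration, tick: Duration, cancel: &AtomicBool) -> bool {
    let start = Instant::now();
    while start.elapsed() < total {
        if cancel.load(Ordering::SeqCst) {
            return true;
        }
        thread::sleep(tick);
    }
    cancel.load(Ordering::SeqCst)
}

/// Run `cmd` to completion. If it outlives `timeout`, send SIGTERM,
/// wait `grace`, then send SIGKILL. Returns the exit status and
/// whether the watchdog fired.
pub fn run_with_timeout(
    mut cmd: Command,
    timeout: Duration,
    grace: Duration,
    tick: Duration,
) -> io::Result<(ExitStatus, bool)> {
    let mut child = cmd.spawn()?;
    let pid = child.id().to_string();
    let cancel = Arc::new(AtomicBool::new(false));
    let fired = Arc::new(AtomicBool::new(false));

    let watchdog = {
        let (cancel, fired) = (Arc::clone(&cancel), Arc::clone(&fired));
        thread::spawn(move || {
            if sleep_or_cancel(timeout, tick, &cancel) {
                return; // child finished normally
            }
            fired.store(true, Ordering::SeqCst);
            // Assumption: signals go through the Unix `kill` utility.
            let _ = Command::new("kill").args(["-TERM", pid.as_str()]).status();
            if sleep_or_cancel(grace, tick, &cancel) {
                return; // child exited within the grace period
            }
            let _ = Command::new("kill").args(["-KILL", pid.as_str()]).status();
        })
    };

    let status = child.wait();
    // Setting the flag lets the watchdog exit within one tick.
    cancel.store(true, Ordering::SeqCst);
    let _ = watchdog.join();
    Ok((status?, fired.load(Ordering::SeqCst)))
}
```

With the commit's values this would be called with a 5-minute timeout, a 5-second grace period, and a 1-second tick. The short tick is what keeps the watchdog from holding the caller for the full timeout after a normal exit.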
ProofOfConcept
b62fffc326 naming agent: resolve node names before creation
Any time an agent creates a new node (WRITE_NODE) or the fact miner
stores extracted facts, a naming sub-agent now checks for conflicts
and ensures the key is meaningful:

- find_conflicts() searches existing nodes via component matching
- Haiku LLM decides: CREATE (good name), RENAME (better name),
  or MERGE_INTO (fold into existing node)
- WriteNode actions may be converted to Refine on MERGE_INTO

Also updates the rename agent to handle _facts-<UUID> nodes —
these are no longer skipped, and the prompt explains how to name
them based on their domain/claim content.
2026-03-10 23:23:14 -04:00
ProofOfConcept
15dedea322 search: make component and content matching opt-in
Default search now uses exact key match only. Component matching
(--fuzzy) and content search (--content) are explicit flags. This
makes missing graph structure visible instead of silently falling
back to broad matching.
2026-03-10 23:01:46 -04:00
ProofOfConcept
12dd320a29 extractor: rewrite as knowledge organizer
Shift from pattern abstraction (creating new nodes) to distillation
(refining existing nodes, demoting redundancies). Priority order:
merge redundancies > file into existing > improve existing > create new.

Query changed to neighborhood-aware: seed → spread → limit, so the
extractor works on related nodes rather than random high-priority ones.
2026-03-10 22:57:13 -04:00
ProofOfConcept
9d29e392a8 agents: add DEMOTE action for redundancy cleanup
New action type that halves a node's weight (min 0.05), enabling
extractors to mark redundant nodes for decay without deleting them.

Parser, apply logic, depth computation, and display all updated.
2026-03-10 22:57:02 -04:00
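The weight rule for DEMOTE in commit 9d29e392a8 (halve, with a floor of 0.05) is small enough to state directly. The names below are illustrative and not taken from the source.

```rust
/// Floor below which a demoted node's weight is not reduced further.
pub const MIN_WEIGHT: f64 = 0.05;

/// Halve `weight`, clamping the result to `MIN_WEIGHT`.
pub fn demote(weight: f64) -> f64 {
    (weight * 0.5).max(MIN_WEIGHT)
}
```

Starting from a weight of 1.0, five consecutive demotions reach the floor (0.5, 0.25, 0.125, 0.0625, 0.05), after which further demotions have no effect.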
ProofOfConcept
8ba58ce9cd neuro: unify consolidation planning, fix threshold drift
The daemon's compute_graph_health had a duplicated copy of the
consolidation planning thresholds that had drifted from the canonical
version (α<2.0 → +7 replay in daemon vs +10 in neuro).

Split consolidation_plan into _inner(store, detect_interference) so
the daemon can call consolidation_plan_quick (skips O(n²) interference)
while using the same threshold logic.
2026-03-10 17:55:08 -04:00
ProofOfConcept
945865f594 agents: extract run_and_apply, eliminate dead split-plan.md
- Add run_and_apply() — combines run_one_agent + action application
  into one call. Used by daemon job_consolidation_agent and
  consolidate_full, which had identical run+apply loops.

- Port split_plan_prompt() to use split.agent via defs::resolve_placeholders
  instead of loading the separate split-plan.md template. Make
  resolve_placeholders public for this.

- Delete prompts/split-plan.md — superseded by agents/split.agent
  which was already the canonical definition.
2026-03-10 17:51:32 -04:00
ProofOfConcept
abab85d249 agents: deduplicate timestamps, plan expansion, rename agent
- Add compact_timestamp() to store — replaces 5 copies of
  format_datetime(now_epoch()).replace([':', '-', 'T'], "")
  Also fixes missing seconds (format_datetime only had HH:MM).

- Add ConsolidationPlan::to_agent_runs() — replaces identical
  plan-to-runs-list expansion in consolidate.rs and daemon.rs.

- Port job_rename_agent to use run_one_agent — eliminates manual
  prompt building, LLM call, report storage, and visit recording
  that duplicated the shared pipeline.

- Rename Confidence::weight()/value() to delta_weight()/gate_value()
  to clarify the distinction (delta metrics vs depth gating).
2026-03-10 17:48:00 -04:00
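The separator stripping that commit abab85d249 consolidates into `compact_timestamp()` can be sketched as below. The input format is an assumption (an ISO-style string), and the function is deliberately given a different name because the real helper also obtains the current time.

```rust
/// Strip the separators from an ISO-style timestamp such as
/// "2026-03-10T17:48:00", leaving only the digits.
pub fn compact(timestamp: &str) -> String {
    // A slice of chars is a pattern that matches any one of them.
    timestamp.replace(&[':', '-', 'T'][..], "")
}
```

Applied to an `HH:MM`-only string the result is two digits shorter, which matches the missing-seconds problem the commit mentions.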
ProofOfConcept
fe7f636ad3 agents: extract shared run_one_agent, standardize output formats
Three places duplicated the agent execution loop (build prompt → call
LLM → store output → parse actions → record visits): consolidate.rs,
knowledge.rs, and daemon.rs. Extract into run_one_agent() in
knowledge.rs that all three now call.

Also standardize consolidation agent prompts to use WRITE_NODE/LINK/REFINE
— the same commands the parser handles. Previously agents output
CATEGORIZE/NOTE/EXTRACT/DIGEST/DIFFERENTIATE/MERGE/COMPRESS which were
silently dropped after the second-LLM-call removal.
2026-03-10 17:33:12 -04:00
ProofOfConcept
f6ea659975 consolidate: eliminate second LLM call, apply actions inline
The consolidation pipeline previously made a second Sonnet call to
extract structured JSON actions from agent reports. This was both
wasteful (extra LLM call per consolidation) and lossy (only extracted
links and manual items, ignoring WRITE_NODE/REFINE).

Now actions are parsed and applied inline after each agent runs, using
the same parse_all_actions() parser as the knowledge loop. The daemon
scheduler's separate apply phase is also removed.

Also deletes 8 superseded/orphaned prompt .md files (784 lines) that
have been replaced by .agent files.
2026-03-10 17:22:53 -04:00
ProofOfConcept
91878d17a0 agents: port knowledge agents to .agent files with visit tracking
The four knowledge agents (observation, extractor, connector,
challenger) were hardcoded in knowledge.rs with their own node
selection logic that bypassed the query pipeline and visit tracking.

Now they're .agent files like the consolidation agents:
- extractor: not-visited:extractor,7d | sort:priority | limit:20
- observation: uses new {{CONVERSATIONS}} placeholder
- connector: type:semantic | not-visited:connector,7d
- challenger: type:semantic | not-visited:challenger,14d

The knowledge loop's run_cycle dispatches through defs::run_agent
instead of calling hardcoded functions, so all agents get visit
tracking automatically. This means the extractor now sees _facts-*
and _mined-transcripts nodes that it was previously blind to.

~200 lines of dead code removed (old runner functions, spectral
clustering for node selection, per-agent LLM dispatch).

New placeholders in defs.rs:
- {{CONVERSATIONS}} — raw transcript fragments for observation agent
- {{TARGETS}} — alias for {{NODES}} (challenger compatibility)

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-10 17:04:44 -04:00
ProofOfConcept
7d6ebbacab daemon: add run-agent RPC for queuing agent jobs
Adds `poc-memory daemon run-agent <type> <count>` CLI command that
sends an RPC to the daemon to queue agent runs, instead of spawning
separate processes.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-10 17:04:30 -04:00
ProofOfConcept
a505b9384e agents: fix agent file parser to split on first newline
The parser was using split_once("\n\n") which broke when the prompt
started immediately after the JSON header (no blank line). Parse
the first line as JSON, treat the rest as the prompt body.
2026-03-10 16:03:10 -04:00
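A minimal sketch of the first-newline split from commit a505b9384e. It returns the header as raw text and leaves JSON parsing out; the function name and the choice to skip leading blank lines in the body are assumptions.

```rust
/// Split an agent file into its one-line JSON header and the prompt
/// body. Returns `None` if the text contains no newline at all.
pub fn split_agent_file(text: &str) -> Option<(&str, &str)> {
    let (header, body) = text.split_once('\n')?;
    // Accept both layouts: prompt directly after the header, or
    // separated from it by blank lines.
    Some((header.trim_end(), body.trim_start_matches(&['\n', '\r'][..])))
}
```

Splitting on the first newline handles both file layouts, whereas `split_once("\n\n")` only worked when a blank line followed the header.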
ProofOfConcept
46db2e7237 agents: remove hardcoded dispatch, clean up pub wrappers
All agents now go through the config-driven path via .agent files.
agent_prompt() just delegates to defs::run_agent(). Remove the 100+
line hardcoded match block and the _pub wrapper functions — make the
formatters pub directly.
2026-03-10 15:53:53 -04:00
ProofOfConcept
16c749f798 agents: placeholder-based prompt templates, port remaining 4 agents
Replace the formatter dispatch with a generic {{placeholder}} lookup
system. Placeholders in prompt templates are resolved at runtime from
a table: topology, nodes, episodes, health, pairs, rename, split.

The query in the header selects what to operate on (keys for visit
tracking); placeholders pull in formatted context. Placeholders that
produce their own node selection (pairs, rename) contribute keys back.

Port health, separator, rename, and split agents to .agent files.
All 7 agents now use the config-driven path.
2026-03-10 15:50:54 -04:00
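The `{{placeholder}}` lookup from commit 16c749f798 could be sketched as a single pass over the template. This is an illustration under assumptions: the real `resolve_placeholders` signature is not shown in the log, the table here holds precomputed strings where the real one calls formatters, and leaving unknown names untouched is a choice made for this sketch.

```rust
use std::collections::HashMap;

/// Replace each `{{name}}` in `template` with its value from `table`.
/// Unknown placeholders are left in place so they stay visible.
pub fn resolve(template: &str, table: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = &after[..end];
                match table.get(name.trim()) {
                    Some(value) => out.push_str(value),
                    None => {
                        out.push_str("{{");
                        out.push_str(name);
                        out.push_str("}}");
                    }
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated "{{": copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

A table-driven lookup like this is what lets a new agent be added as a prompt file alone, as long as it only needs placeholders that already exist.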
Kent Overstreet
b4e674806d agents: self-contained agent files with embedded prompts
Each agent is a .agent file: JSON config on the first line, blank line,
then the raw prompt markdown. Fully self-contained, fully readable.
No separate template files needed.

Agents dir: checked into repo at poc-memory/agents/. Code looks there
first (via CARGO_MANIFEST_DIR), falls back to ~/.claude/memory/agents/.

Three agents migrated: replay, linker, transfer.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-10 15:29:55 -04:00
Kent Overstreet
e736471d99 search: unified query pipeline with filters, transforms, generators
Extend the pipeline with four stage types composing left-to-right:
  Generators: all, match:TERM
  Filters: type:, key:, weight:, age:, content-len:, provenance:,
           not-visited:, visited: (plus ! negation)
  Transforms: sort:(priority|timestamp|content-len|degree|weight), limit:N
  Algorithms: spread, spectral, confluence, geodesic, manifold (unchanged)

Duration syntax (7d, 24h, 30m) and glob matching on keys.
CLI auto-detects filter/transform stages and loads full Store;
algorithm-only pipelines keep the fast MmapView path.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-10 15:22:12 -04:00
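Two pieces of the pipeline syntax in commit e736471d99 are easy to illustrate: the duration suffixes (`7d`, `24h`, `30m`) and the `name:argument` stage form. The sketch below is hypothetical. The function names are invented, and the `|` separator is taken from the agent queries quoted elsewhere in this log, not from this commit.

```rust
/// Parse a duration like "7d", "24h", or "30m" into seconds.
pub fn parse_duration_secs(s: &str) -> Option<u64> {
    let s = s.trim();
    let per_unit: u64 = match s.chars().last()? {
        'd' => 86_400,
        'h' => 3_600,
        'm' => 60,
        _ => return None,
    };
    // The unit is a single ASCII byte, so slicing it off is safe.
    let number = &s[..s.len() - 1];
    number.parse::<u64>().ok()?.checked_mul(per_unit)
}

/// Split a pipeline string on `|` into `(name, argument)` pairs.
/// Stages without a colon (such as `all`) get an empty argument.
pub fn split_stages(pipeline: &str) -> Vec<(&str, &str)> {
    pipeline
        .split('|')
        .map(str::trim)
        .filter(|stage| !stage.is_empty())
        .map(|stage| stage.split_once(':').unwrap_or((stage, "")))
        .collect()
}
```

A real parser would then dispatch on the stage name to decide whether it is a generator, filter, transform, or algorithm; that classification is omitted here.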
Kent Overstreet
c6bb7c3910 util: add jsonl_load/jsonl_append helpers, convert graph.rs
Shared JSONL read/write in util.rs replaces hand-rolled patterns.
graph.rs metrics load/save converted as first consumer.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-10 15:22:03 -04:00
ProofOfConcept
0e1e5a1981 agent visits: track when agents successfully process nodes
New append-only visits.capnp log records which agent processed which
node and when. Only recorded on successful completion — transient
errors don't mark nodes as "seen."

Schema: AgentVisit{nodeUuid, nodeKey, agent, timestamp, outcome}
Storage: append_visits(), replay_visits(), in-memory VisitIndex
Recording: daemon records visits after successful LLM call
API: agent_prompt() returns AgentBatch{prompt, node_keys} so callers
know which nodes to mark as visited.

Groundwork for using visit recency in agent node selection — agents
will deprioritize recently-visited nodes.
2026-03-10 14:30:53 -04:00
ProofOfConcept
9f14a29181 dedup: find and merge duplicate nodes with edge redirection
New `poc-memory dedup` command (--apply for live run, dry-run by
default). Finds nodes sharing the same key but different UUIDs,
classifies them as identical or diverged, picks a survivor
(prefer most edges, then highest version), tombstones the rest,
and redirects all edges from doomed UUIDs to the survivor.
2026-03-10 14:30:33 -04:00
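The survivor rule and edge redirection from commit 9f14a29181 can be sketched with plain structs. The `Dup` type and the representation of edges as pairs of UUID strings are assumptions for illustration; tombstoning and the identical/diverged classification are left out.

```rust
/// Hypothetical summary of one duplicate of a key.
#[derive(Debug, Clone, PartialEq)]
pub struct Dup {
    pub uuid: String,
    pub edges: usize,
    pub version: u32,
}

/// Pick the survivor: most edges first, then highest version.
pub fn pick_survivor(dups: &[Dup]) -> Option<&Dup> {
    // Tuples compare lexicographically, so edges dominate version.
    dups.iter().max_by_key(|d| (d.edges, d.version))
}

/// Rewrite edges so any endpoint in `doomed` points at `survivor`.
pub fn redirect_edges(edges: &mut [(String, String)], doomed: &[String], survivor: &str) {
    for (src, dst) in edges.iter_mut() {
        if doomed.contains(&*src) {
            *src = survivor.to_string();
        }
        if doomed.contains(&*dst) {
            *dst = survivor.to_string();
        }
    }
}
```

Redirection can leave parallel edges or self-loops when two duplicates were linked to the same neighbor or to each other; whether the real command cleans those up is not stated in the commit message.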
ProofOfConcept
37ae37667b store: lock-refresh-write pattern to prevent duplicate UUIDs
All write paths (upsert_node, upsert_provenance, delete_node,
rename_node, ingest_units) now hold StoreLock across the full
refresh→check→write cycle. This prevents the race where two
concurrent processes both see a key as "new" and create separate
UUIDs for it.

Adds append_nodes_unlocked() and append_relations_unlocked() for
callers already holding the lock. Adds refresh_nodes() to replay
log tail under lock before deciding create vs update.

Also adds find_duplicates() for detecting existing duplicates
in the log (replays full log, groups live nodes by key).
2026-03-10 14:30:21 -04:00
ProofOfConcept
8bbc246b3d split agent: parallel execution, agent-driven edges, no MCP overhead
- Refactor split from serial batch to independent per-node tasks
  (run-agent split N spawns N parallel tasks, gated by llm_concurrency)
- Replace cosine similarity edge inheritance with agent-assigned
  neighbors in the plan JSON — the LLM already understands the
  semantic relationships, no need to approximate with bag-of-words
- Add --strict-mcp-config to claude CLI calls to skip MCP server
  startup (saves ~5s per call)
- Remove hardcoded 2000-char split threshold — let the agent decide
  what's worth splitting
- Reload store before mutations to handle concurrent split races
2026-03-10 03:21:33 -04:00
ProofOfConcept
149c289fea split agent: filter candidates to semantic nodes only
Episodic nodes (journal entries, digests) are narratives that should
not be split even when large. Only semantic reference nodes that have
grown to cover multiple topics are candidates.
2026-03-10 01:58:15 -04:00
ProofOfConcept
ca62692a28 split agent: two-phase node decomposition for memory consolidation
Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.

This handles arbitrarily large nodes because output is always
proportional to one child, not the whole parent. Tested on the kent
node (19K chars → 3 children totaling 20K chars with clean topic
separation).

New files:
  prompts/split-plan.md   — phase 1 planning prompt
  prompts/split-extract.md — phase 2 extraction prompt
  prompts/split.md        — original single-phase (kept for reference)

Modified:
  agents/prompts.rs — split_candidates(), split_plan_prompt(),
                      split_extract_prompt(), agent_prompt "split" arm
  agents/daemon.rs  — job_split_agent() two-phase implementation,
                      RPC dispatch for "split" agent type
  tui.rs            — added "split" to AGENT_TYPES
2026-03-10 01:48:41 -04:00
ProofOfConcept
4c973183c4 rename agent: LLM-powered semantic key generation for memory nodes
New consolidation agent that reads node content and generates semantic
3-5 word kebab-case keys, replacing auto-generated slugs (5K+ journal
entries with truncated first-line slugs, 2.5K mined transcripts with
opaque UUIDs).

Implementation:
- prompts/rename.md: agent prompt template with naming conventions
- prompts.rs: format_rename_candidates() selects nodes with long
  auto-generated keys, newest first
- daemon.rs: job_rename_agent() parses RENAME actions from LLM
  output and applies them directly via store.rename_node()
- Wired into RPC handler (run-agent rename) and TUI agent types
- Fix epoch_to_local panic on invalid timestamps (fallback to UTC)

Rename dramatically improves search: key-component matching on
"journal#2026-02-28-violin-dream-room" makes the node findable by
"violin", "dream", or "room" — the auto-slug was unsearchable.
2026-03-10 00:55:26 -04:00
ProofOfConcept
ef760f0053 poc-memory status: add ratatui TUI dashboard
Per-agent-type tabs (health, replay, linker, separator, transfer,
apply, orphans, cap, digest, digest-links, knowledge) with dynamic
visibility — tabs only appear when tasks or log history exist.

Features:
- Overview tab: health gauges (α, gini, cc, episodic%), in-flight
  tasks, and recent log entries
- Pipeline tab: table with phase ordering and status
- Per-agent tabs: active tasks, output logs, log history
- Log tab: auto-scrolling daemon.log tail
- Vim-style count prefix: e.g. 5r runs 5 iterations of the agent
- Flash messages for RPC feedback
- Tab/Shift-Tab navigation, number keys for tab selection

Also adds run-agent RPC to the daemon: accepts agent type and
iteration count, spawns chained tasks with LLM resource pool.

poc-memory status launches TUI when stdout is a terminal and daemon
is running, falls back to text output otherwise.
2026-03-10 00:41:29 -04:00
ProofOfConcept
06df66cf4c memory-search: add fuzzy key matching and content-based seed extraction
match_seeds() previously only found nodes whose keys exactly matched
search terms. This meant searches like "formal verification" or
"bcachefs plan" returned nothing — no nodes are keyed with those
exact strings.

Three-tier matching strategy:
1. Exact key match (full weight) — unchanged
2. Key component match (0.5× weight) — split keys on -/_/./#,
   match individual words. "plan" now finds "the-plan", "verification"
   finds "c-to-rust-verification-workflow", etc.
3. Content match (0.2× weight, capped at 50 hits) — search node
   content for terms that didn't match any key. Catches nodes whose
   keys are opaque but whose content is relevant.

Also adds prompt-based seeding to the hook pipeline: extract_query_terms
from the user's prompt and merge into the term set. Previously the hook
only seeded from transcript scanning (finding node keys as substrings
in conversation history), which meant fresh sessions or queries about
new topics produced no search results at all.
2026-03-10 00:41:08 -04:00
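The three-tier strategy in commit 06df66cf4c can be illustrated for a single search term. This is a sketch, not `match_seeds()` itself: the `Node` struct is hypothetical, case-insensitive comparison is an assumption, and the weights are expressed as multipliers on a base of 1.0.

```rust
/// Hypothetical node: a key plus its text content.
pub struct Node {
    pub key: String,
    pub content: String,
}

/// Maximum number of content-tier hits.
pub const CONTENT_HIT_CAP: usize = 50;

/// Score nodes against one term: exact key match = 1.0, key
/// component match = 0.5, content match = 0.2. Content is searched
/// only when the term matched no key.
pub fn match_term(nodes: &[Node], term: &str) -> Vec<(String, f64)> {
    let term = term.to_lowercase();
    let mut hits: Vec<(String, f64)> = nodes
        .iter()
        .filter_map(|n| {
            let key = n.key.to_lowercase();
            if key == term {
                Some((n.key.clone(), 1.0))
            } else if key.split(&['-', '_', '.', '#'][..]).any(|part| part == term) {
                Some((n.key.clone(), 0.5))
            } else {
                None
            }
        })
        .collect();
    if hits.is_empty() {
        hits.extend(
            nodes
                .iter()
                .filter(|n| n.content.to_lowercase().contains(&term))
                .take(CONTENT_HIT_CAP)
                .map(|n| (n.key.clone(), 0.2)),
        );
    }
    hits
}
```

The decreasing weights keep a broad content match from outranking a node that was deliberately keyed for the term.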
ProofOfConcept
2f896bca2c poc-hook: call memory-search on PostToolUse for chunk drip-feed
Wire up the PostToolUse handler to call memory-search --hook, passing
through the hook JSON on stdin. This drains pending context chunks
saved by the initial UserPromptSubmit load, delivering them one per
tool call until all chunks are delivered.
2026-03-09 17:57:04 -04:00
ProofOfConcept
d5554db6a8 memory-search: chunk context output for hook delivery
Claude Code's hook output limit (~10K chars) was truncating the full
context load. Split output into chunks at section boundaries, deliver
first chunk on UserPromptSubmit, save remaining chunks to disk for
drip-feeding on subsequent PostToolUse calls.

Two-pass algorithm: split at "--- KEY (group) ---" boundaries, then
merge adjacent small sections up to 9K per chunk. Separates session_id
guard (needed for chunk state) from prompt guard (needed only for
search), so PostToolUse events without a prompt can still pop chunks.
2026-03-09 17:56:59 -04:00
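The two-pass chunking in commit d5554db6a8 can be sketched as follows. The header test (a line starting with `--- ` and ending with ` ---`) is inferred from the boundary format quoted in the message, the limit is measured in bytes here, and keeping an oversized single section as its own chunk is an assumption.

```rust
/// Split `text` into sections that each begin at a "--- ... ---"
/// header line, then greedily merge adjacent sections into chunks of
/// at most `max_len` bytes. An oversized section stays whole.
pub fn chunk_sections(text: &str, max_len: usize) -> Vec<String> {
    // Pass 1: cut at header lines.
    let mut sections: Vec<String> = Vec::new();
    for line in text.split_inclusive('\n') {
        let t = line.trim();
        let is_header = t.len() >= 9 && t.starts_with("--- ") && t.ends_with(" ---");
        if is_header || sections.is_empty() {
            sections.push(String::new());
        }
        sections.last_mut().unwrap().push_str(line);
    }
    // Pass 2: merge neighbors while the chunk stays within the limit.
    let mut chunks: Vec<String> = Vec::new();
    for section in sections {
        let fits = chunks
            .last()
            .map_or(false, |last| last.len() + section.len() <= max_len);
        if fits {
            chunks.last_mut().unwrap().push_str(&section);
        } else {
            chunks.push(section);
        }
    }
    chunks
}
```

With the commit's numbers, `max_len` would be about 9K, which leaves headroom under the roughly 10K hook output limit. The first chunk is delivered immediately and the rest are saved for later calls.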
Kent Overstreet
32d17997af memory-search: fix returned-set deduplication and pre-seeded count
mark_returned() was append-only without checking if the key already
existed, causing duplicates to accumulate across hook invocations.
load_returned() then returned all entries including duplicates, which
made the returned count exceed the seen count, causing a u64 underflow
in the pre-seeded calculation.

Fix: check load_returned() before appending in mark_returned(), dedup
on read in load_returned(), and use saturating_sub for the pre-seeded
count as a safety net.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 17:15:24 -04:00
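Both halves of the fix in commit 32d17997af can be shown in a few lines: deduplicating the returned keys, and using `saturating_sub` so that a bad count clamps to zero instead of underflowing. The function names and in-memory slices below are illustrative; the real code reads and appends to a file.

```rust
use std::collections::HashSet;

/// Drop repeated keys while keeping first-seen order.
pub fn dedup_keys(keys: &[String]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for key in keys {
        // insert() returns false if the key was already present.
        if seen.insert(key.as_str()) {
            out.push(key.clone());
        }
    }
    out
}

/// `seen - returned`, clamped at zero. A plain subtraction on u64
/// panics in debug builds and wraps in release builds.
pub fn pre_seeded(seen: u64, returned: u64) -> u64 {
    seen.saturating_sub(returned)
}
```

The deduplication addresses the cause, and the saturating subtraction only limits the damage if the counts ever disagree again.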
Kent Overstreet
6dc300fcf8 poc-hook: call memory-search internally on UserPromptSubmit
Spawn memory-search --hook as a subprocess, piping the hook input
JSON through stdin and printing its stdout. This ensures memory
context injection goes through the same hook whose output Claude
Code reliably persists, fixing the issue where memory-search as a
separate hook had its output silently dropped.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 17:07:16 -04:00
Kent Overstreet
c2f245740c transcript: extract JSONL backward scanner and compaction detection into library
Move JsonlBackwardIter and find_last_compaction() from
parse-claude-conversation into a shared transcript module. Both
memory-search and parse-claude-conversation now use the same robust
compaction detection: mmap-based backward scan, JSON parsing to
verify user-type message, content prefix check.

Replaces memory-search's old detect_compaction() which did a forward
scan with raw string matching on "continued from a previous
conversation" — that could false-positive on the string appearing
in assistant output or tool results.

Add parse-claude-conversation as a new binary for debugging what's
in the context window post-compaction.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 17:06:32 -04:00
Kent Overstreet
0e17ab00b0 store: handle DST gaps in epoch_to_local
chrono's timestamp_opt can return None during DST transitions.
Handle all three variants (Single, Ambiguous, None) instead of
unwrapping. For DST gaps, offset by one hour to land in valid
local time.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 17:02:29 -04:00
Kent Overstreet
53e6b32cb4 daemon: rework consolidation pipeline and add graph health metrics
Replace monolithic consolidate job with individual agent jobs
(replay, linker, separator, transfer, health) that run sequentially
and store reports. Multi-phase daily pipeline: agent runs → apply
actions → link orphans → cap degree → digest → digest links →
knowledge loop.

Add GraphHealth struct with graph metrics (alpha, gini, clustering
coefficient, episodic ratio) computed during health checks. Display
in `poc-memory daemon status`. Use cached metrics to build
consolidation plan without expensive O(n²) interference detection.

Add RPC consolidate command to trigger consolidation via socket.
Harden session watcher: skip transcripts with zero segments, improve
migration error handling.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 17:02:01 -04:00
ProofOfConcept
8eb6308760 experience-mine: per-segment dedup keys, retry backoff
The whole-file dedup key (_mined-transcripts#f-{UUID}) prevented mining
new compaction segments when session files grew. Replace with per-segment
keys (_mined-transcripts#f-{UUID}.{N}) so each segment is tracked
independently.

Changes:
- daemon session-watcher: segment-aware dedup, migrate 272 existing
  whole-file keys to per-segment on restart
- seg_cache with size-based invalidation (re-parse when file grows)
- exponential retry backoff (5min → 30min cap) for failed sessions
- experience_mine(): write per-segment key only, backfill on
  content-hash early return
- fact-mining gated on all per-segment keys existing

Also adds documentation:
- docs/claude-code-transcript-format.md: JSONL transcript format
- docs/plan-experience-mine-dedup-fix.md: design document
2026-03-09 02:27:51 -04:00
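The retry schedule in commit 8eb6308760 (5 minutes, capped at 30) might be computed as in this sketch. The doubling factor is an assumption, since the commit states only the starting delay and the cap, and the names are invented.

```rust
use std::time::Duration;

/// First retry delay and upper bound, taken from the commit message.
pub const BASE: Duration = Duration::from_secs(5 * 60);
pub const CAP: Duration = Duration::from_secs(30 * 60);

/// Delay before the retry that follows `failures` consecutive
/// failures. Assumption: the delay doubles each time.
pub fn retry_delay(failures: u32) -> Duration {
    // Bound the shift so large failure counts cannot overflow.
    let doublings = failures.saturating_sub(1).min(16);
    let secs = BASE.as_secs().saturating_mul(1u64 << doublings);
    Duration::from_secs(secs).min(CAP)
}
```

Under these assumptions the delays run 5, 10, and 20 minutes, then stay at 30 minutes for every later attempt.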
Kent Overstreet
1326a683a5 spread: separate traversal from ranking
Node weight no longer gates signal propagation — only edge_decay
and edge_strength affect traversal. Node weight is applied at the
end for ranking. This lets low-weight nodes serve as bridges
without killing the signal passing through them.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:38:33 -04:00
Kent Overstreet
05c7d55949 spread: simultaneous wavefront instead of independent BFS
All seeds emit at once. At each hop, activations from all sources
sum at each node, and the combined map propagates on the next hop.
Nodes where multiple wavefronts overlap get reinforced and radiate
stronger — natural interference patterns.

Lower default min_activation threshold (×0.1) since individual
contributions are smaller in additive mode.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:35:27 -04:00
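The simultaneous-wavefront idea in commit 05c7d55949 can be sketched on a toy adjacency list. This is an illustration under assumptions: the `Graph` type, the parameter names, and the way per-hop activations are accumulated into a total are not taken from the source, and node-weight ranking is omitted.

```rust
use std::collections::HashMap;

/// Adjacency list: node -> [(neighbor, edge_strength)].
pub type Graph = HashMap<String, Vec<(String, f64)>>;

/// All seeds emit at once. At each hop, contributions arriving at a
/// node are summed, and the combined map becomes the next frontier.
pub fn spread(
    graph: &Graph,
    seeds: &[(String, f64)],
    max_hops: usize,
    edge_decay: f64,
    min_activation: f64,
) -> HashMap<String, f64> {
    let mut total: HashMap<String, f64> = HashMap::new();
    let mut frontier: HashMap<String, f64> = HashMap::new();
    for (key, weight) in seeds {
        *frontier.entry(key.clone()).or_insert(0.0) += weight;
        *total.entry(key.clone()).or_insert(0.0) += weight;
    }
    for _ in 0..max_hops {
        let mut next: HashMap<String, f64> = HashMap::new();
        for (node, activation) in &frontier {
            for (neighbor, strength) in graph.get(node).into_iter().flatten() {
                *next.entry(neighbor.clone()).or_insert(0.0) +=
                    activation * strength * edge_decay;
            }
        }
        // Drop signals too weak to be worth propagating.
        next.retain(|_, a| *a >= min_activation);
        if next.is_empty() {
            break;
        }
        for (node, a) in &next {
            *total.entry(node.clone()).or_insert(0.0) += a;
        }
        frontier = next;
    }
    total
}
```

When two seeds share a neighbor, that neighbor receives the sum of both contributions and passes the stronger signal on, which is the reinforcement effect the commit describes.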
Kent Overstreet
c13a9da81c manifold: fix direction initialization, add power iteration rounds
Initialize direction from the two most spectrally separated seeds
instead of relying on input order (which was alphabetical from
BTreeMap). Run 3 rounds of power iteration with normalization
instead of 1 for better convergence.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:27:24 -04:00
Kent Overstreet
01dd8e5ef9 search: add --full flag to show node content in results
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:25:42 -04:00
Kent Overstreet
63253f102a search: add confluence, geodesic, and manifold algorithms
Three new composable search stages:

  confluence — multi-source spreading activation. Unlike spread (which
  takes max from any source), confluence rewards nodes reachable from
  multiple seeds additively. Naturally separates unrelated seed groups
  since their neighborhoods don't overlap. Params: max_hops, edge_decay,
  min_sources.

  geodesic — straightest path between seed pairs in spectral space.
  At each graph hop, picks the neighbor whose spectral direction most
  aligns with the target (cosine similarity of direction vectors).
  Nodes on many geodesic paths score highest. Params: max_path, k.

  manifold — extrapolation along the direction seeds define. Computes
  weighted centroid + principal axis of seeds in spectral space, then
  scores candidates by projection onto that axis (penalized by
  perpendicular distance). Finds what's "further along" rather than
  "nearby." Params: k.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:22:29 -04:00
Kent Overstreet
c1664bf76b search: composable algorithm pipeline
Break search into composable stages that chain left-to-right:
each stage takes seeds Vec<(String, f64)> and returns modified seeds.

Available algorithms:
  spread              — spreading activation through graph edges
  spectral            — nearest neighbors in spectral embedding
  manifold            — (placeholder) extrapolation along seed direction

Stages accept inline params: spread,max_hops=4,edge_decay=0.5

memory-search gets --hook, --debug, --seen modes plus positional
pipeline args. poc-memory search gets -p/--pipeline flags.

Also: fix spectral decompose() to skip zero eigenvalues from
disconnected components, filter degenerate zero-coord nodes from
spectral projection, POC_AGENT bail-out for daemon agents, all
debug output to stdout.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-09 01:19:04 -04:00
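The stage syntax from commit c1664bf76b (`spread,max_hops=4,edge_decay=0.5`) and the left-to-right chaining of stages over `Vec<(String, f64)>` can be sketched together. The `Stage` struct, the function names, and treating every parameter as a float are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Seeds flowing between stages, as described in the commit.
pub type Seeds = Vec<(String, f64)>;

/// A parsed stage: algorithm name plus inline `key=value` params.
#[derive(Debug, PartialEq)]
pub struct Stage {
    pub name: String,
    pub params: HashMap<String, f64>,
}

/// Parse a spec such as "spread,max_hops=4,edge_decay=0.5".
pub fn parse_stage(spec: &str) -> Result<Stage, String> {
    let mut parts = spec.split(',').map(str::trim);
    let name = parts.next().filter(|n| !n.is_empty()).ok_or("empty stage")?;
    let mut params = HashMap::new();
    for part in parts {
        let (key, value) = part
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got '{}'", part))?;
        let value: f64 = value
            .trim()
            .parse()
            .map_err(|_| format!("bad number for '{}'", key))?;
        params.insert(key.trim().to_string(), value);
    }
    Ok(Stage { name: name.to_string(), params })
}

/// Chain stages left to right: each one consumes the previous seeds.
pub fn run_pipeline(stages: &[&dyn Fn(Seeds) -> Seeds], seeds: Seeds) -> Seeds {
    stages.iter().fold(seeds, |acc, stage| stage(acc))
}
```

Because every stage has the same input and output type, algorithms can be reordered or repeated on the command line without any stage needing to know about the others.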
ProofOfConcept
0a35a17fad use HashSet for orphan edge dedup, fix redundant type qualification
Replace O(n²) Vec::contains + sort/dedup with O(n) HashSet for orphan
node tracking in health_report(). Use imported HashMap type instead of
fully-qualified std::collections::HashMap.
2026-03-08 21:43:58 -04:00
ProofOfConcept
92f3ba5acf extract shared transcript parser and similarity matching helpers
- New agents/transcript.rs: shared JSONL parsing for enrich, fact_mine,
  and knowledge (was 3 separate implementations, ~150 lines duplicated)
- New best_match() and section_children() helpers in neuro/rewrite.rs
  (was duplicated find-best-by-similarity loop + section collection)
- Net -153 lines
2026-03-08 21:42:53 -04:00