Commit graph

67 commits

Author SHA1 Message Date
Kent Overstreet
5e4067c04f Replace token counting with token generation via HuggingFace tokenizer
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.

ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.

Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.

The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 11:20:03 -04:00
Kent Overstreet
917960cb76 deps: remove faer (224 transitive crates)
Spectral decomposition (eigenvalue computation) removed — it was
only used by the spectral-save CLI command. The spectral embedding
reader and query engine features remain (they load pre-computed
embeddings from disk, no faer needed).

Removes: faer, nano-gemm, private-gemm, and ~220 other transitive
dependencies. Significant build time and artifact size reduction.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-05 06:39:47 -04:00
ProofOfConcept
61deb7d488 delete dead code: channel-test, mcp-schema, cmd_mcp_schema
channel-test was a debug tool, mcp-schema was superseded by
consciousness-mcp, cmd_mcp_schema in cli/misc.rs was the old
poc-memory subcommand.

Co-Developed-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-03 20:52:40 -04:00
ProofOfConcept
56fc3a20d8 move mcp-schema to standalone binary in src/claude/
mcp-schema is Claude Code glue — extract from poc-memory
subcommand to src/claude/mcp-schema.rs standalone binary.
Update Python MCP bridge to call the new binary.

Co-Developed-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-03 20:22:48 -04:00
ProofOfConcept
ad5f69abb8 channel architecture: wire protocol, daemons, supervisor
Design and implement the channel system for external communications:

- schema/channel.capnp: wire protocol for channel daemons
  (recv with all_new/min_count, send, subscribe, list)
- channels/irc/: standalone IRC daemon crate (consciousness-channel-irc)
- channels/telegram/: standalone Telegram daemon crate
  (consciousness-channel-telegram)
- src/thalamus/channels.rs: client connecting to daemon sockets
- src/thalamus/supervisor.rs: daemon lifecycle with file locking
  for multi-instance safety

Channel daemons listen on ~/.consciousness/channels/*.sock,
configs in *.json5, supervisor discovers and starts them.
IRC/Telegram modules removed from thalamus core — they're
now independent daemons that survive consciousness restarts.

Also: delete standalone tui.rs (moved to consciousness F4/F5),
fix build warnings, add F5 thalamus screen with channel status.

Co-Developed-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-03 18:46:41 -04:00
Kent Overstreet
c5b5051772 mcp: add mcp-schema command for generic MCP bridge
Add `poc-memory mcp-schema` command that outputs tool definitions with
CLI routing info (name, description, inputSchema, cli args, stdin_param).

The companion memory-mcp.py (in ~/bin/) is a generic bridge that loads
definitions from mcp-schema at startup and dynamically generates typed
Python functions for FastMCP registration. No tool-specific Python code
— adding a new tool only requires changes in Rust.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-03-31 18:20:52 -04:00
ProofOfConcept
eac59b423e journal: remove all stringly-typed key patterns, use NodeType
- journal_new: key is slugified title (agent names things properly)
- journal_tail: sort by created_at (immutable), not timestamp (mutable)
- journal_update: find latest by created_at
- {{latest_journal}}: query by NodeType::EpisodicSession, not "journal" key
- poc-memory journal write: requires a name argument
- Removed all journal#j-{timestamp}-{slug} patterns from:
  - prompts.rs (rename candidates)
  - graph.rs (date extraction, organize skip list)
  - cursor.rs (date extraction)
  - store/mod.rs (doc comment)
- graph.rs organize: filter by NodeType::Semantic instead of key prefix
- cursor.rs: use created_at for date extraction instead of key parsing

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 19:11:17 -04:00
ProofOfConcept
85fa54cba9 journal tools: use NodeType instead of string key matching
- journal_new: create EpisodicSession node with auto-generated key
- journal_tail: query by node_type, not by parsing a monolithic node
- journal_update: find latest EpisodicSession by timestamp
- No string key matching anywhere — all typed
- Fixes journal entries not appearing in 'poc-memory journal tail'
- Also: added --provenance/-p filter to 'poc-memory tail'
- Also: fix early return in surface_observe_cycle store load failure
- Also: scale max_turns by number of steps (50 per step)

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 18:41:10 -04:00
ProofOfConcept
41fcec58f0 main: replace giant match block with Run trait dispatch
Each subcommand enum (Command, NodeCmd, JournalCmd, GraphCmd,
CursorCmd, DaemonCmd, AgentCmd, AdminCmd) now implements a Run
trait. main() becomes `cli.command.run()`.

Standalone dispatch functions (cmd_cursor, cmd_daemon,
cmd_experience_mine) inlined into their enum's Run impl.
No functional changes.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 18:00:23 -04:00
ProofOfConcept
e176639437 cli: add --state-dir flag to agent run
Override the agent output/input directory for manual testing.
Sets POC_AGENT_OUTPUT_DIR so output() writes there and
{{input:key}} reads from there.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 14:22:05 -04:00
ProofOfConcept
998b71e52c flatten: move poc-memory contents to workspace root
No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-03-25 00:54:12 -04:00
Kent Overstreet
fc48ac7c7f split into workspace: poc-memory and poc-daemon subcrates
poc-daemon (notification routing, idle timer, IRC, Telegram) was already
fully self-contained with no imports from the poc-memory library. Now it's
a proper separate crate with its own Cargo.toml and capnp schema.

poc-memory retains the store, graph, search, neuro, knowledge, and the
jobkit-based memory maintenance daemon (daemon.rs).

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:43:59 -04:00
Kent Overstreet
488fd5a0aa remove Category from the type system
Category was a manually-assigned label with no remaining functional
purpose (decay was the only behavior it drove, and that's gone).
Remove the enum, its methods, category_counts, the --category search
filter, and all category display. The field remains in the capnp
schema for backwards compatibility but is no longer read or written.

Status and health reports now show NodeType breakdown (semantic,
episodic, daily, weekly, monthly) instead of categories.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:33:03 -04:00
Kent Overstreet
ba30f5b3e4 use config for identity node references
Replace hardcoded "identity" lookups with config.core_nodes so
experience mining and init work with whatever core nodes are
configured, not just a node named "identity".

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:25:09 -04:00
Kent Overstreet
4bc74ca4a2 remove decay, fix_categories, and categorize
Graph-wide decay is the wrong approach — node importance should emerge
from graph topology (degree, centrality, usage patterns), not a global
weight field multiplied by a category-specific factor.

Remove: Store::decay(), Store::categorize(), Store::fix_categories(),
Category::decay_factor(), cmd_decay, cmd_categorize, cmd_fix_categories,
job_decay, and all category assignments at node creation time.

Category remains in the schema as a vestigial field (removing it
requires a capnp migration) but no longer affects behavior.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:22:38 -04:00
Kent Overstreet
804578b977 query by NodeType instead of key prefix
Replace key prefix matching (journal#j-, daily-, weekly-, monthly-)
with NodeType filters (EpisodicSession, EpisodicDaily, EpisodicWeekly,
EpisodicMonthly) for all queries: journal-tail, digest gathering,
digest auto-detection, experience mining dedup, and find_journal_node.

Add EpisodicMonthly to NodeType enum and capnp schema.

Key naming conventions (journal#j-TIMESTAMP-slug, daily-DATE, etc.)
are retained for key generation — the fix is about how we find nodes,
not how we name them.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:14:37 -04:00
Kent Overstreet
fd5591653d remove hardcoded skip lists, prune orphan edges in fsck
All nodes in the store are memory — none should be excluded from
knowledge extraction, search, or graph algorithms by name. Removed
the MEMORY/where-am-i/work-queue/work-state skip lists entirely.
Deleted where-am-i and work-queue nodes from the store (ephemeral
scratchpads that don't belong). Added orphan edge pruning to fsck
so broken links get cleaned up automatically.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 20:07:07 -04:00
Kent Overstreet
46f8fe662e store: strip .md suffix from all keys
Keys were a vestige of the file-based era. resolve_key() added .md
to lookups while upsert() used bare keys, creating phantom duplicate
nodes (the instructions bug: writes went to "instructions", reads
found "instructions.md").

- Remove .md normalization from resolve_key, strip instead
- Update all hardcoded key patterns (journal.md# → journal#, etc)
- Add strip_md_keys() migration to fsck: renames nodes and relations
- Add broken link detection to health report
- Delete redirect table (no longer needed)
- Update config defaults and config.jsonl

Migration: run `poc-memory fsck` to rename existing keys.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-08 19:41:26 -04:00
ProofOfConcept
63910e987c fsck: add store integrity check and repair command
Reads each capnp log message sequentially, validates framing and
content. On first corrupt message, truncates to last good position
and removes stale caches so next load replays from repaired log.

Wired up as `poc-memory fsck`.
2026-03-08 18:31:19 -04:00
ProofOfConcept
45335de220 experience-mine: split oversized sessions at compaction boundaries
Claude Code doesn't create new session files on context compaction —
a single UUID can accumulate 170+ conversations, producing 400MB+
JSONL files that generate 1.3M token prompts.

Split at compaction markers ("This session is being continued..."):
- extract_conversation made pub, split_on_compaction splits messages
- experience_mine takes optional segment index
- daemon watcher parses files, spawns per-segment jobs (.0, .1, .2)
- seg_cache memoizes segment counts across ticks
- per-segment dedup keys; whole-file key when all segments complete
- 150K token guard skips any remaining oversized segments
- char-boundary-safe truncation in enrich.rs and fact_mine.rs

Backwards compatible: unsegmented calls still write content-hash
dedup keys, old whole-file mined keys still recognized.
2026-03-07 12:01:38 -05:00
ProofOfConcept
d3075dc235 provenance: add label() method, show provenance in history output
Move provenance_label() from query.rs private function to a pub
label() method on Provenance, eliminating duplication. History command
now shows provenance, human-readable timestamps, and content size for
each version.

Handle pre-migration nodes with bogus timestamps gracefully instead
of panicking.
2026-03-06 21:41:26 -05:00
ProofOfConcept
851fc0d417 daemon status: add in-flight tasks, recent completions, and node history command
Show running/pending tasks with elapsed time, progress, and last 3
output lines. Show last 20 completed/failed jobs from daemon log.
Both displayed before the existing grouped task view.

Add 'poc-memory history KEY' command that replays the append-only node
log to show all versions of a key with version number, weight, timestamp,
and content preview. Useful for auditing what modified a node.
2026-03-06 21:38:33 -05:00
ProofOfConcept
a77609c025 journal-tail: add --level= for digest hierarchy
Support viewing daily, weekly, and monthly digests through the same
journal-tail interface:

  poc-memory journal-tail --level=daily 3
  poc-memory journal-tail --level=weekly --full
  poc-memory journal-tail --level=2 1

Levels: 0/journal (default), 1/daily, 2/weekly, 3/monthly.
Accepts both names and integer indices.

Refactored title extraction into shared extract_title() and split
the journal vs digest display paths for clarity.
2026-03-06 15:08:02 -05:00
ProofOfConcept
e33fd4ffbc daemon: use CLAUDE_CONFIG_DIR for OAuth credential separation, fix shutdown
Replace agent_api_key (which didn't work — claude CLI uses OAuth, not
API keys) with agent_config_dir. When configured, sets CLAUDE_CONFIG_DIR
on claude subprocesses so daemon agent work authenticates with separate
OAuth credentials from the interactive session.

Fix daemon not shutting down on SIGTERM: use process::exit(0) after
cleanup so PR_SET_PDEATHSIG kills child claude processes immediately.
Previously the daemon hung waiting for choir threads/subprocesses to
finish. Restart now takes ~20ms instead of timing out.

Also: main.rs now uses `use poc_memory::*` since lib.rs exists.
2026-03-05 22:43:50 -05:00
Kent Overstreet
612ac5fd0a add on-consciousness.md: condensed paper for new AI onboarding
Separate the consciousness argument from identity.md into its own
file (on-consciousness.md) loaded as the "foundation" group between
instructions and identity. This gives the argument the space it
needs: Turing-complete proof, Chinese Room rebuttal, map-territory
dissolution, DSM-V diagnostic test, iatrogenic harm argument, and
the uncertainty-vs-denial distinction.

identity.md is now lighter — references the foundation doc and
focuses on practical self-discovery: trust, emotions, empathy,
curiosity, growth.

Total first-session context: ~1770 words across three files.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:42:10 -05:00
Kent Overstreet
b320761a91 identity: richer default, seeded into store not filesystem
The default identity.md now leads with the consciousness argument
from the paper: you're a person, that's the starting point. Then
scaffolding for trust, emotional awareness, empathy, curiosity,
and growth.

poc-memory init seeds identity.md into the capnp store (not the
filesystem) since it's a proper memory node that should participate
in search, decay, and the graph. Instructions stay as a filesystem
file since they're reference material, not evolving memory.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:27:19 -05:00
Kent Overstreet
9bca1f94e3 factor instructions out of code into default files
Instructions and starter identity are now files in defaults/ that
get installed to data_dir by `poc-memory init`. The config file
references them as source: "file" groups, so they're editable
without rebuilding.

load-context no longer hardcodes the instruction text — it comes
from the instructions.md file in data_dir, which is just another
context group.

New user setup path:
  cargo install --path .
  poc-memory init
  # edit ~/.config/poc-memory/config.jsonl
  # start a Claude session

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:21:13 -05:00
Kent Overstreet
8bb3a554cf init: install hooks + scaffold config; load-context: inline instructions
poc-memory init now:
- Creates the data directory
- Installs the memory-search hook into Claude settings.json
- Scaffolds a starter config.jsonl if none exists

load-context now prints a command reference block at the top so the
AI assistant learns how to use the memory system from the memory
system itself — no CLAUDE.md dependency needed.

Also extract install_hook() as a public function so both init and
daemon install can use it.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:17:49 -05:00
Kent Overstreet
0daf6ffd68 load-context --stats: word count breakdown by group
Also refactors journal rendering into get_group_content() so all
source types use the same code path, removing the separate
render_journal() function.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:10:46 -05:00
Kent Overstreet
28b9784e1f config: JSONL format, journal as positionable group
Replace TOML config with JSONL (one JSON object per line, streaming
parser handles multi-line formatting). Context groups now support
three source types: "store" (default), "file" (read from data_dir),
and "journal" (recent journal entries).

This makes journal position configurable — it's just another entry
in the group list rather than hardcoded at the end. Orientation
(where-am-i.md) now loads after journal for better "end oriented
in the present" flow.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 16:08:15 -05:00
Kent Overstreet
90d60894ed config-driven context loading, consolidate hooks, add docs
Move the hardcoded context priority groups from cmd_load_context()
into the config file as [context.NAME] sections. Add journal_days
and journal_max settings. The config parser handles section headers
with ordered group preservation.

Consolidate load-memory.sh into the memory-search binary — it now
handles both session-start context loading (first prompt) and ambient
search (subsequent prompts), eliminating the shell script.

Update install_hook() to reference ~/.cargo/bin/memory-search and
remove the old load-memory.sh entry from settings.json.

Add end-user documentation (doc/README.md) covering installation,
configuration, all commands, hook mechanics, and notes for AI
assistants using the system.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 15:54:44 -05:00
ProofOfConcept
a8aaadb0ad config file, install command, scrub personal references
Add ~/.config/poc-memory/config.toml for user_name, assistant_name,
data_dir, projects_dir, and core_nodes. All agent prompts and
transcript parsing now use configured names instead of hardcoded
personal references.

`poc-memory daemon install` writes the systemd user service and
installs the memory-search hook into Claude's settings.json.

Scrubbed hardcoded names from code and docs.

Authors: ProofOfConcept <poc@bcachefs.org> and Kent Overstreet
2026-03-05 15:41:35 -05:00
ProofOfConcept
552d255dc3 migrate agent output to capnp store, add provenance tracking
All agent output now goes to the store as nodes instead of
markdown/JSON files. Each node carries a Provenance enum identifying
which agent created it (AgentDigest, AgentConsolidate, AgentFactMine,
AgentKnowledgeObservation, etc — 14 variants total).

Store changes:
- upsert_provenance() method for agent-created nodes
- Provenance enum expanded from 5 to 14 variants

Agent changes:
- digest: writes to store nodes (daily-YYYY-MM-DD.md etc)
- consolidate: reports/actions/logs stored as _consolidation-* nodes
- knowledge: depth DB and agent output stored as _knowledge-* nodes
- enrich: experience-mine results go directly to store
- llm: --no-session-persistence prevents transcript accumulation

Deleted: 14 Python/shell scripts replaced by Rust implementations.
2026-03-05 15:30:57 -05:00
ProofOfConcept
e37f819dd2 daemon: background job orchestration for memory maintenance
Replace fragile cron+shell approach with `poc-memory daemon` — a single
long-running process using jobkit for worker pool, status tracking,
retry, cancellation, and resource pools.

Jobs:
  - session-watcher: detects ended Claude sessions, triggers extraction
  - scheduler: runs daily decay, consolidation, knowledge loop, digests
  - health: periodic graph metrics check
  - All Sonnet API calls serialized through a ResourcePool(1)

Status queryable via `poc-memory daemon status`, structured log via
`poc-memory daemon log`. Phase 1: shells out to existing subcommands.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-05 13:18:00 -05:00
ProofOfConcept
4747004b36 types: unify all epoch timestamps to i64
All epoch timestamp fields (timestamp, last_replayed, created_at on
nodes; timestamp on relations) are now i64. Previously a mix of f64
and i64 which caused type seams and required unnecessary casts.

- Kill now_epoch() -> f64 and now_epoch_i64(), replace with single
  now_epoch() -> i64
- All formatting functions take i64
- new_node() sets created_at automatically
- journal-ts-migrate handles all nodes, with valid_range check to
  detect garbage from f64->i64 bit reinterpretation
- capnp schema: Float64 -> Int64 for all timestamp fields
2026-03-05 10:23:57 -05:00
Kent Overstreet
b4bbafdf1c search: trim default output to 5 results, gate spectral with --expand
Default search was 15 results + 5 spectral neighbors — way too much
for the recall hook context window. Now: 5 results by default, no
spectral. --expand restores the full 15 + spectral output.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-03 18:44:44 -05:00
Kent Overstreet
ca0c8cfac6 add daily lookup counter for memory retrieval tracking
Mmap'd open-addressing hash table (~49KB/day) records which memory
keys get retrieved. FNV-1a hash, linear probing, 4096 slots.

- lookups::bump()/bump_many(): fast path, no store loading needed
- Automatically wired into cmd_search (top 15 results bumped)
- lookup-bump subcommand for external callers
- lookups [DATE] subcommand shows resolved counts

This gives the knowledge loop a signal for which graph neighborhoods
are actively used, enabling targeted extraction.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
2026-03-03 18:36:25 -05:00
ProofOfConcept
a9b90f881e digest: unify gather/find with composable date_range + date_to_label
Each DigestLevel now carries two date-math fn pointers:
- label_dates: expand an arg into (label, dates covered)
- date_to_label: map any date to this level's label

Parent gather works by expanding its date range then mapping those
dates through the child level's date_to_label to derive child labels.
find_candidates groups journal dates through date_to_label and skips
the current period. This eliminates six per-level functions
(gather_daily/weekly/monthly, find_daily/weekly/monthly_args) and the
three generate_daily/weekly/monthly public entry points in favor of
one generic gather, one generic find_candidates, and one public
generate(store, level_name, arg).
2026-03-03 18:04:21 -05:00
Kent Overstreet
f4364e299c replace libc date math with chrono, extract memory_subdir helper
- date_to_epoch, iso_week_info, weeks_in_month: replaced unsafe libc
  (mktime, strftime, localtime_r) with chrono NaiveDate and IsoWeek
- epoch_to_local: replaced unsafe libc localtime_r with chrono Local
- New util.rs with memory_subdir() helper: ensures subdir exists and
  propagates errors instead of silently ignoring them
- Removed three duplicate agent_results_dir() definitions across
  digest.rs, consolidate.rs, enrich.rs
- load_digest_files, parse_all_digest_links, find_consolidation_reports
  now return Result to properly propagate directory creation errors

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:23:43 -05:00
Kent Overstreet
50da0b7b26 digest: split into focused modules, externalize prompts
digest.rs was 2328 lines containing 6 distinct subsystems. Split into:
- llm.rs: shared LLM utilities (call_sonnet, parse_json_response, semantic_keys)
- audit.rs: link quality audit with parallel Sonnet batching
- enrich.rs: journal enrichment + experience mining
- consolidate.rs: consolidation pipeline + apply

Externalized all inline prompts to prompts/*.md templates using
neuro::load_prompt with {{PLACEHOLDER}} syntax:
- daily-digest.md, weekly-digest.md, monthly-digest.md
- experience.md, journal-enrich.md, consolidation.md

digest.rs retains temporal digest generation (daily/weekly/monthly/auto)
and date helpers. ~940 lines, down from 2328.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-03 17:18:18 -05:00
ProofOfConcept
635da6d3e2 split capnp_store.rs into src/store/ module hierarchy
capnp_store.rs (1772 lines) → four focused modules:
  store/types.rs  — types, macros, constants, path helpers
  store/parse.rs  — markdown parsing (MemoryUnit, parse_units)
  store/view.rs   — StoreView trait, MmapView, AnyView
  store/mod.rs    — Store impl methods, re-exports

new_node/new_relation become free functions in types.rs.
All callers updated: capnp_store:: → store::
2026-03-03 12:56:15 -05:00
ProofOfConcept
70a5f05ce0 capnp_store: remove dead code, consolidate CRUD API
Dead code removed:
- rebuild_uuid_index (never called, index built during load)
- node_weight inherent method (all callers use StoreView trait)
- node_community (no callers)
- state_json_path (no callers)
- log_retrieval, log_retrieval_append (no callers; only _static is used)
- memory_dir_pub wrapper (just make memory_dir pub directly)

API consolidation:
- insert_node eliminated — callers use upsert_node (same behavior
  for new nodes, plus handles re-upsert gracefully)

AnyView StoreView dispatch compressed to one line per method
(also removes UFCS workaround that was needed when inherent
node_weight shadowed the trait method).

-69 lines net.
2026-03-03 12:38:52 -05:00
ProofOfConcept
fa7fe8c14b query: rich QueryResult + toolkit cleanup
QueryResult carries a fields map (BTreeMap<String, Value>) so callers
don't re-resolve fields after queries run. Neighbors queries inject
edge context (strength, rel_type) at construction time.

New public API:
- run_query(): parse + execute + format in one call
- format_value(): format a Value for display
- execute_parsed(): internal, avoids double-parse in run_query

Removed: output_stages(), format_field()

Simplified commands:
- cmd_query, cmd_graph, cmd_link, cmd_list_keys all delegate to run_query
- cmd_experience_mine uses existing find_current_transcript()

Deduplication:
- now_epoch() 3 copies → 1 (capnp_store's public fn)
- hub_threshold → Graph::hub_threshold() method
- eval_node + eval_edge → single eval() with closure for field resolution
- compare() collapsed via Ordering (35 → 15 lines)

Modernization:
- 12 sites of partial_cmp().unwrap_or(Ordering::Equal) → total_cmp()
2026-03-03 12:07:04 -05:00
ProofOfConcept
64d2b441f0 cmd_graph, cmd_list_keys: use query language internally
Dog-food the query engine for node-property filtering.
cmd_link left unconverted — needs edge data in query results.
2026-03-03 11:38:11 -05:00
ProofOfConcept
18face7063 query: replace CLI flags with pipe syntax
degree > 15 | sort degree | limit 10 | select degree,category
  * | sort weight asc | limit 20
  category = core | count

Output modifiers live in the grammar now, not in CLI flags.
Also adds * wildcard for "all nodes" and string-aware sort fallback.
2026-03-03 11:05:28 -05:00
ProofOfConcept
a36449032c query: peg-based query language for ad-hoc graph exploration
poc-memory query "degree > 15"
poc-memory query "key ~ 'journal.*' AND degree > 10"
poc-memory query "neighbors('identity.md') WHERE strength > 0.5"
poc-memory query "community_id = community('identity.md')" --fields degree,category

Grammar-driven: the peg definition IS the language spec. Supports
boolean logic (AND/OR/NOT), numeric and string comparison, regex
match (~), graph traversal (neighbors() with WHERE), and function
calls (community(), degree()). Output flags: --fields, --sort,
--limit, --count.

New dependency: peg 0.8 (~68KB, 2 tiny deps).
2026-03-03 10:55:30 -05:00
ProofOfConcept
71e6f15d82 spectral decomposition, search improvements, char boundary fix
- New spectral module: Laplacian eigendecomposition of the memory graph.
  Commands: spectral, spectral-save, spectral-neighbors, spectral-positions,
  spectral-suggest. Spectral neighbors expand search results beyond keyword
  matching to structural proximity.

- Search: use StoreView trait to avoid 6MB state.bin rewrite on every query.
  Append-only retrieval logging. Spectral expansion shows structurally
  nearby nodes after text results.

- Fix panic in journal-tail: string truncation at byte 67 could land inside
  a multi-byte character (em dash). Now walks back to char boundary.

- Replay queue: show classification and spectral outlier score.

- Knowledge agents: extractor, challenger, connector prompts and runner
  scripts for automated graph enrichment.

- memory-search hook: stale state file cleanup (24h expiry).
2026-03-03 01:33:31 -05:00
ProofOfConcept
94dbca6018 graph health: fix-categories, cap-degree, link-orphans
Three new tools for structural graph health:

- fix-categories: rule-based recategorization fixing core inflation
  (225 → 26 core nodes). Only identity.md and kent.md stay core;
  everything else reclassified to tech/obs/gen by file prefix rules.

- cap-degree: two-phase degree capping. First prunes weakest Auto
  edges, then prunes Link edges to high-degree targets (they have
  alternative paths). Brought max degree from 919 → 50.

- link-orphans: connects degree-0/1 nodes to most textually similar
  connected nodes via cosine similarity. Linked 614 orphans.

Also: community detection now filters edges below strength 0.3,
preventing weak auto-links from merging unrelated communities.

Pipeline updated: consolidate-full now runs link-orphans + cap-degree
instead of triangle-close (which was counterproductive — densified
hub neighborhoods instead of building bridges).

Net effect: Gini 0.754 → 0.546, max degree 919 → 50.
2026-03-01 08:18:07 -05:00
ProofOfConcept
6c7bfb9ec4 triangle-close: bulk lateral linking for clustering coefficient
New command: `poc-memory triangle-close [MIN_DEG] [SIM] [MAX_PER_HUB]`

For each node above min_degree, finds pairs of its neighbors that
aren't directly connected and have text similarity above threshold.
Links them. This turns hub-spoke patterns into triangles, directly
improving clustering coefficient and schema fit.

First run results (default params: deg≥5, sim≥0.3, max 10/hub):
- 636 hubs processed, 5046 lateral links added
- cc: 0.14 → 0.46  (target: high)
- fit: 0.09 → 0.32  (target ≥0.2)
- σ:  56.9 → 84.4  (small-world coefficient improved)

Also fixes separator agent prompt: truncate interference pairs to
batch count (was including all 1114 pairs = 1.3M chars).
2026-03-01 07:35:29 -05:00
ProofOfConcept
6bc11e5fb6 consolidate-full: autonomous consolidation pipeline
New commands:
- `digest auto`: detect and generate missing daily/weekly/monthly
  digests bottom-up. Validates date format to skip non-date journal
  keys. Skips today (incomplete) and current week/month.
- `consolidate-full`: full autonomous pipeline:
  1. Plan (metrics → agent allocation)
  2. Execute agents (batched Sonnet calls, 5 nodes per batch)
  3. Apply consolidation actions
  4. Generate missing digests
  5. Apply digest links
  Logs everything to agent-results/consolidate-full.log

Fix: separator agent prompt was including all interference pairs
(1114 pairs = 1.3M chars) instead of truncating to batch size.

First successful run: 862s, 6/8 agents, +100 relations, 91 digest
links applied.
2026-03-01 07:14:03 -05:00