- Strip context bloat from nudge messages — no more IRC digest, git
log, or work state inlined into tmux send-keys (was silently dropping
the entire message). Nudge now just includes pending notification count.
- Notifications no longer send directly via tmux — they flow through
the idle nudge only. Urgent notifications reset the fired flag so
the nudge fires sooner.
- Add test-nudge RPC that exercises the actual daemon send path
(test-send was client-side only, didn't test the real code path).
- Update nudge text: "Let your feelings guide your thinking."
- Increase send-keys sleep from 200ms to 500ms.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Agent identity injection: prepend core-personality to all agent prompts
so agents dream as me, not as generic graph workers. Include instructions
to walk the graph and connect new nodes to core concepts.
- Parallel agent scheduling: sequential within type, parallel across types.
Different agent types (linker, organize, replay) run concurrently.
- Linker prompt: graph walking instead of keyword search for connections.
"Explore the local topology and walk the graph until you find the best
connections."
- memory-search fixes: format_results no longer truncates to 5 results,
pipeline default raised to 50, returned file cleared on compaction,
--seen and --seen-full merged, compaction timestamp in --seen output,
max_entries=3 per prompt for steady memory drip.
- Stemmer optimization: strip_suffix now works in-place on a single String
buffer instead of allocating 18 new Strings per word. Note for future:
reversed-suffix trie for O(suffix_len) instead of O(n_rules).
- Transcript: add compaction_timestamp() for --seen display.
- Agent budget configurable (default 4000 from config).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The graph changes fast with 1000+ agents per cycle. Daily was too
slow for the feedback loop. 6-hour cycle means Elo evaluation and
agent reallocation happen 4x per day.
Runs on first tick after daemon start (initialized to past).
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
agent_budget config (default 1000) replaces health-metric-computed
totals. The budget is the total agent runs per cycle — use it all.
Elo distribution is squared for power-law unfairness: top-rated agents
get disproportionately more runs. If linker has Elo 1123 and connector
has 876, linker gets ~7x more runs (squared ratio) vs ~3.5x (linear).
Minimum 2 runs per type so underperformers still get evaluated.
No Elo file → equal distribution as fallback.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
After health metrics compute the total agent budget, read
agent-elo.json and redistribute proportionally to Elo ratings.
Higher-rated agent types get more runs.
Health determines HOW MUCH work. Elo determines WHAT KIND.
Every type gets at least 1 run. If no Elo file exists, falls back
to the existing hardcoded allocation.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The LCG was producing only 2 distinct matchup pairs due to poor
constants. Switch to xorshift32 for proper coverage of all type pairs.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Replace sort-based ranking with proper Elo system:
- Each agent TYPE has a persistent Elo rating (agent-elo.json)
- Each matchup: pick two random types, grab a recent action from
each, LLM compares, update ratings
- Ratings persist across daily evaluations — natural recency bias
from continuous updates against current opponents
- K=32 for fast adaptation to prompt changes
Usage: poc-memory agent evaluate --matchups 30 --model haiku
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
TIE causes inconsistency in sort (A=B, B=C but A>C breaks ordering).
Force the comparator to always pick a winner. Default to A if response
is unparseable.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
- Use CARGO_MANIFEST_DIR for agent file path (same as defs.rs)
- Dedup affected nodes extracted from reports
- --dry-run shows example comparison prompt without LLM calls
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Chain-of-thought: "say which is better and why" forces clearer
judgment and gives us analysis data for improving agents.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
When both actions are from the same agent, show the instructions once
and just compare the two report outputs + affected nodes. Saves tokens
and makes the comparison cleaner.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Each comparison now shows the LLM:
- Agent instructions (the .agent prompt file)
- Report output (what the agent did)
- Affected nodes content (what it changed)
The comparator sees intent, action, and impact — can judge whether
a deletion was correct, whether links are meaningful, whether
WRITE_NODEs capture real insights.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Yes, really. Rust's stdlib sort_by with an LLM pairwise comparator.
Each comparison is an API call asking "which action was better?"
Sample N actions per agent type, throw them all in a Vec, sort.
Where each agent's samples cluster = that agent's quality score.
Reports per-type average rank and quality ratio.
Supports both haiku (fast/cheap) and sonnet (quality) as comparator.
Usage: poc-memory agent evaluate --samples 5 --model haiku
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Evaluate agent will use sort-based ranking (LLM as merge sort
comparator) instead of absolute scoring. Stub for now — needs
Rust sampling code to bundle before/after pairs.
Fixed distill query: sort:degree (not sort:degree desc).
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Walks high-degree hub nodes, reads neighborhood, distills essential
insights upward into the hub. REFINE to update stale hubs, SPLIT
to flag hubs that cover too many sub-topics. Size discipline:
200-500 words per hub, flag over 800 for splitting.
Completes the agent ecology: extract (experience) → link (linker) →
organize (clusters) → distill (hubs) → rename (vocabulary) → split
(overgrown hubs). Each stage refines the previous.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add "core principle: keys are concepts" — renaming defines the
vocabulary of the knowledge graph. Core keywords should be the
search terms. Updated examples to use dash separator (no more #).
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
search.rs → query/engine.rs (algorithms, pipeline, seed matching)
query.rs → query/parser.rs (PEG query language, field resolution)
query/mod.rs re-exports for backwards compatibility.
crate::search still works (aliased to query::engine).
crate::query::run_query resolves to the parser's entry point.
No logic changes — pure file reorganization.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Organize runs at half the linker count — synthesizes what linker
connects, creates hub nodes for unnamed concepts.
Connector runs when communities are fragmented (<5 nodes/community
→ 20 runs, <10 → 10 runs). Bridges isolated clusters.
Both interleaved round-robin with existing agent types.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Tell linker and organize agents to:
- Name unnamed concepts: when 3+ nodes share a theme with no hub,
create one with WRITE_NODE that synthesizes the generalization
- Percolate up: gather key insights from children into hub content,
so the hub is self-contained without needing to follow every link
This addresses the gap where agents are good at extraction and linking
but not synthesis — turning episodic observations into semantic concepts.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Make 'created' resolve to created_at epoch (numeric, sortable) and add
'timestamp' field. Enables `sort created desc` and `sort created asc`
in query pipelines.
Example: poc-memory query "key ~ 'bcachefs' | sort created desc | limit 10"
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Update the experience mining prompt to output links alongside journal
entries. The LLM now returns a "links" array per entry pointing to
existing semantic nodes. Rust code creates the links immediately after
node creation — new nodes arrive pre-connected instead of orphaned.
Also: remove # from all key generation paths (experience miner,
digest section keys, observed transcript keys). New nodes get clean
dash-separated keys.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Remove all the quoting instructions, warnings about shell comments,
and "CRITICAL" blocks about single quotes. Keys are plain dashes now.
Agent tool examples are clean and minimal.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add `poc-memory admin bulk-rename FROM TO [--apply]` for bulk key
character replacement. Uses rename_node() per key for proper capnp
log persistence. Collision detection, progress reporting, auto-fsck.
Applied: renamed 13,042 keys from # to - separator. This fixes the
Claude Bash tool's inability to pass # in command arguments (the
model confabulates that quoting doesn't work and gives up).
7 collision pairs resolved by deleting the # version before rename.
209 orphan edges pruned by fsck.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The agent was confabulating that # keys can't be passed to the Bash
tool. They work fine with single quotes — the agent just gave up too
early. Added explicit "single quotes WORK, do not give up" with a
concrete example.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Linker agents output **LINK** (bold) with backtick-wrapped keys, and
**WRITE_NODE**/**END_NODE** with bold markers. The parsers expected
plain LINK/WRITE_NODE without markdown formatting, silently dropping
all actions from tool-enabled agents.
Updated regexes to accept optional ** bold markers and backtick key
wrapping. Also reverted per-link Jaccard computation (too expensive
in batch) — normalize-strengths should be run periodically instead.
This was causing ~600 links and ~40 new semantic nodes per overnight
batch to be silently lost.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Agent subprocess calls now set POC_PROVENANCE=agent:{name} so any
nodes/links created via tool calls are tagged with the creating agent.
This makes agent transcripts indistinguishable from conscious sessions
in format — important for future model training.
new_relation() now reads POC_PROVENANCE env var directly (raw string,
not enum) since agent names are dynamic.
link-add now computes initial strength from Jaccard similarity instead
of hardcoded 0.8. New links start at a strength reflecting actual
neighborhood overlap.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Some Sonnet runs preemptively refuse to use tools ("poc-memory tool
needs approval") without attempting to run them. Adding explicit
instruction that tools are pre-approved and should be used directly.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add adjust_edge_strength() to Store — modifies strength on all edges
between two nodes, clamped to [0.05, 0.95].
New commands:
- `not-relevant KEY` — weakens ALL edges to the node by 0.01
(bad routing: search found the wrong thing)
- `not-useful KEY` — weakens node weight, not edges
(bad content: search found the right thing but it's not good)
Enhanced `used KEY` — now also strengthens all edges to the node by
0.01, in addition to the existing node weight boost.
Three-tier design: agents adjust by 0.00001 (automatic), conscious
commands adjust by 0.01 (deliberate), manual override sets directly.
All clamped, never hitting 0 or 1.
Design spec: .claude/analysis/2026-03-14-link-strength-feedback.md
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add jaccard() and jaccard_strengths() to Graph. Jaccard similarity
measures neighborhood overlap between linked nodes — nodes sharing
many neighbors get stronger links, nodes with no shared neighbors
get weak links.
New subcommand: `poc-memory graph normalize-strengths [--apply]`
Scales raw Jaccard (typically 0.0-0.3) to useful range via j*3
clamped to [0.1, 1.0]. Skips implicit temporal edges (strength=1.0).
Applied to 64,969 edges. Distribution is bimodal: large cluster at
0.1-0.2 (weak) and spike at 0.9-1.0 (strong), with smooth gradient
between. Replaces the meaningless 0.3/0.8 split from manual/agent
creation methods.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Create jobkit-daemon crate with generic daemon infrastructure:
- event_log: JSONL append with size-based rotation
- socket: Unix domain socket RPC client and server with signal handling
- status: JSON status file read/write
Migrate daemon.rs to use the library:
- Worker pool setup via Daemon::new()
- Socket loop + signal handling via Daemon::run()
- RPC handlers as registered closures
- Logging, status writing, send_rpc all delegate to library
Migrate tui.rs to use socket::send_rpc() instead of inline UnixStream.
daemon.rs: 1952 → 1806 lines (-146), old status_socket_loop removed.
tui.rs: socket boilerplate removed.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Linker: give it Bash(poc-memory:*) tools so it can render nodes,
query neighbors, and search before creating. Adds search-before-create
discipline to reduce redundant node creation.
Organize: remove MERGE operation, make DELETE conservative (only true
duplicates or garbage). Add "Preserve diversity" rule — multiple nodes
on similar topics are features, not bugs. LINK is primary operation.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Rebalance consolidation scoring to be linker-heavy:
- 50 replay + 100 linker for extreme hub dominance (was 10+5)
- High gini now adds linker instead of replay
- Agent runs interleave types round-robin (linker, replay, linker...)
instead of running all of one type then all of another
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Compute parent/child (session→daily→weekly→monthly) and prev/next
(chronological ordering within each level) edges at graph build time
from node metadata. Parse dates from keys for digest nodes (whose
timestamps reflect creation, not covered date) and prefer key-parsed
dates over timestamp-derived dates for sessions (timezone fix).
Result: ~9185 implicit edges, communities halved, gini improved.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Previously the organize agent received a pre-computed cluster from a
term search — 69% of runs produced 0 actions because the same clusters
kept being found via different entry points.
Now: seed nodes shown with content previews and neighbor lists. Agent
uses tools (render, query neighbors, search) to explore outward and
discover what needs organizing. Visit filter set to 24h cooldown.
Prompt rewritten to encourage active exploration rather than static
cluster analysis.
Persistent cursor into the knowledge graph with navigation:
- temporal: forward/back among same-type nodes by timestamp
- hierarchical: up/down the digest tree (journal→daily→weekly→monthly)
- spatial: graph neighbor display at every position
The cursor file (~/.claude/memory/cursor) holds a single node key.
Show displays: temporal arrows, hierarchy links, semantic neighbors,
and full content. Date extraction from both timestamps and key names
handles the mixed-timestamp data gracefully.
This is the start of place cells — spatial awareness of position
in your own knowledge.
When generating a digest, automatically link all source entries to the
digest node (journal entries → daily, dailies → weekly, weeklies →
monthly). This builds the temporal spine of the graph — previously
~4000 journal entries were disconnected islands unreachable by recall.
Rewrote digest prompt to produce narrative rather than reports:
capture the feel, the emotional arc, what it was like to live through
it. Letter to future self, not a task log.
Moved prompt to digest.agent file alongside other agent definitions.
Falls back to prompts/digest.md if agent file not found.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Keys containing # are now pre-quoted in all cluster output (similarity
scores, hub analysis, node headers) so the agent copies them correctly
into bash commands. Prompt strengthened with CRITICAL warning about #
being a shell comment character.
Journal entries included in clusters but identified by node_type
(EpisodicSession) rather than key prefix, and tagged [JOURNAL — no
delete] in the output. Prompt rule 3b tells agent to LINK/REFINE
journals but never DELETE them. Digest nodes (daily/weekly/monthly)
still excluded entirely from clusters.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Add progress callback to run_one_agent and run_and_apply so callers
can see: prompt size, node list, LLM call timing, parsed action
count, and per-action applied/skipped status. Daemon writes these
to the persistent event log via log_event.
Cap organize cluster to 20 nodes - 126 nodes produced a 682KB
prompt that timed out every time. Agent has tools to explore
further if needed. Restore general query for production runs.
Previous prompt was too documentation-heavy — agent pattern-matched
on example placeholders instead of doing actual work. New prompt:
structured as direct instructions, uses {{organize}} placeholder
for pre-computed cluster data, three clear decision paths (merge,
differentiate, keep both), numbered rules.
Convert daemon from hand-rolled string dispatch to proper clap
Subcommand enum with typed args. Add custom top-level help that
expands nested subcommands (same pattern as bcachefs-tools), so
`poc-memory --help` shows full paths like `agent daemon run`.
Add call_for_def() that threads model and tools from agent definitions
through to claude CLI. Tool-enabled agents get --allowedTools instead
of --tools "" and a longer 15-minute timeout for multi-turn work.
Add ActionKind::Delete with parse/apply support so agents can delete
nodes (used by organize agent for deduplication).
Use call_for_def() in run_one_agent instead of hardcoded call_sonnet.
Add `poc-memory graph organize TERM` diagnostic that finds nodes
matching a search term, computes pairwise cosine similarity, reports
connectivity gaps, and optionally creates anchor nodes.
Add organize.agent definition that uses Bash(poc-memory:*) tool access
to explore clusters autonomously — query selects highest-degree
unvisited nodes, agent drives its own iteration via poc-memory CLI.
Add {{organize}} placeholder in defs.rs for inline cluster resolution.
Add `tools` field to AgentDef/AgentHeader so agents can declare
allowed tool patterns (passed as --allowedTools to claude CLI).
Two changes:
1. New -q/--query flag for direct search without hook machinery.
Useful for debugging: memory-search -q inner-life-sexuality-intimacy
shows seeds, spread results, and rankings.
2. Prompt key boost: when the current prompt contains a node key
(>=5 chars) as a substring, boost that term by +10.0. This ensures
explicit mentions fire as strong seeds for spread, while the graph
still determines what gets pulled in.
Co-Authored-By: ProofOfConcept <poc@bcachefs.org>