Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled,
Hermes tool call parser for function calling support. Configurable
via environment variables (MODEL, PORT, MAX_MODEL_LEN).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make ApiClient a process-wide singleton via OnceLock so the
connection pool is reused across agent calls. Fix the sync wrapper
to properly pass the caller's log closure through thread::scope
instead of dropping it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run the async API call on a dedicated thread with its own tokio
runtime so it works whether called from a sync context or from
within an existing tokio runtime (daemon).
Also drops the log closure capture issue — uses a simple eprintln
fallback since the closure can't cross thread boundaries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When api_base_url is configured, agents call the LLM directly via
OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling
out to claude CLI. Implements the full tool loop: send prompt, if
tool_calls execute them and send results back, repeat until text.
This enables running agents against local/remote models like
Qwen-27B on a RunPod B200, with no dependency on claude CLI.
Config fields: api_base_url, api_key, api_model.
Falls back to claude CLI when api_base_url is not set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jobkit-daemon is now an external git dependency with its own repo.
The local clone was only needed temporarily to fix a broken
Cargo.toml in the remote.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lists nodes that are currently deleted with no subsequent live version.
Useful for diagnosing accidental deletions in the memory store.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split poc-agent into lib + bin so its API client, types, and tool
dispatch can be imported by poc-memory. This is the foundation for
replacing claude CLI subprocess calls with direct API calls to
vllm/OpenAI-compatible endpoints.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move poc-agent (substrate-independent AI agent framework) into the
memory workspace as a step toward using its API client for direct
LLM calls instead of shelling out to claude CLI.
Agent prompt improvements:
- distill: rewrite from hub-focused to knowledge-flow-focused.
Now walks upward from seed nodes to find and refine topic nodes,
instead of only maintaining high-degree hubs.
- distill: remove "don't touch journal entries" restriction
- memory-instructions-core: add "Make it alive" section — write
with creativity and emotional texture, not spreadsheet summaries
- memory-instructions-core: add "Show your reasoning" section —
agents must explain decisions, especially when they do nothing
- linker: already had emotional texture guidance (kept as-is)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Edition 2024 changes:
- gen is reserved: rename variable in query/engine.rs
- set_var is unsafe: wrap in unsafe block in cli/agent.rs
- match ergonomics: add explicit & in spectral.rs filter closure
New --local flag for `poc-memory agent run` bypasses the daemon and
runs the agent directly in-process. Useful for testing agent prompt
changes without waiting in the daemon queue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The POC_PROVENANCE env var lookup was duplicated in upsert,
delete_node, and rename_node. Extract to a single function.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
delete_node and rename_node were cloning the previous node version
for the tombstone/rename entry without updating provenance or
timestamp. This made it impossible to tell who deleted a node or
when — the tombstone just inherited whatever the last write had.
Now both operations derive provenance from POC_PROVENANCE env var
(same as upsert) and set timestamp to now.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
cmd_history was silently hiding the deleted flag, making it
impossible to tell from the output that a node had been deleted.
This masked the kernel-patterns deletion — looked like the node
existed in the log but wouldn't load.
Also adds merge-logs and diag-key diagnostic binaries, and makes
Node::to_capnp public for use by external tools.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
rewrite_store() used File::create() to truncate and overwrite the
entire nodes.capnp log with only the latest version of each node
from the in-memory store. This destroyed all historical versions
and made no backup. Worse, any node missing from the in-memory
store due to a loading bug would be permanently lost.
strip_md_keys() now appends migrated nodes to the existing log
instead of rewriting it. The dead function is kept with a warning
comment explaining what went wrong.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
calibrate.agent: Haiku-based agent that reads a node and all its
neighbors, then assigns appropriate link strengths relative to each
other. Designed for high-volume runs across the whole graph.
graph link-set: Set strength of an existing link (0.0-1.0).
dominating-set query stage: Greedy 3-covering dominating set — finds
the minimum set of nodes such that every node in the input is within
1 hop of at least 3 selected nodes. Use with calibrate agent to
ensure every link gets assessed from multiple perspectives.
Usage: poc-memory query "content ~ 'bcachefs' | dominating-set"
--target and --query now queue individual daemon tasks instead of
running sequentially in the CLI. Each node gets its own choir task
with LLM resource locking. Falls back to local execution if daemon
isn't running.
RPC extended: "run-agent linker 1 target:KEY" spawns a targeted task.
Run an agent on nodes matching a query:
poc-memory agent run linker --query 'key ~ "bcachefs" | limit 10'
Resolves the query to node keys, then passes all as seeds to the agent.
For large batches, should be queued to daemon (future work).
experience_mine and journal_enrich are replaced by the observation
agent. enrich.rs reduced from 465 to 40 lines — only extract_conversation
and split_on_compaction remain (used by observation fragment selection).
-455 lines.
Remove unused StoreView imports, unused store imports, dead
install_default_file, dead make_report_slug, dead fact-mine/
experience-mine spawning loops in daemon. Fix mut warnings.
Zero compiler warnings now.
All 12 agents with WRITE_NODE/REFINE/END_NODE output format blocks
now rely on tool calls (poc-memory write/link-add/etc) via the
Bash(poc-memory:*) tool. Guidelines preserved, format sections removed.
Also changed linker query from type:episodic to all nodes — it was
missing semantic nodes entirely, which is why skills-bcachefs-* nodes
were never getting linked to their hubs.
Adds run_one_agent_with_keys() which bypasses the agent's query and
uses explicitly provided node keys. This allows testing agents on
specific graph neighborhoods:
poc-memory agent run linker --target bcachefs --debug
New placeholder resolves to the 20 highest-degree nodes, skipping
neighbors of already-selected hubs so the list covers different
regions of the graph. Gives agents a starting point for linking
new content to the right places.
Added to observation.agent prompt.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Large conversation segments are now split into 50KB chunks with 10KB
overlap, instead of being truncated to 8000 chars (which was broken
anyway — broke after exceeding, not before). Each chunk gets its own
candidate ID for independent mining and dedup.
format_segment simplified: no size limit, added timestamps to output
so observation agent can cross-reference with journal entries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate agent logging to one file per run in llm-logs/{agent}/.
Prompt written before LLM call, response appended after. --debug
additionally prints the same content to stdout.
Remove duplicate eprintln! calls and AgentResult.prompt field.
Kill experience_mine and fact_mine job functions from daemon —
observation.agent handles all transcript mining.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Empty stdout and Claude's rate limit message were silently returned
as successful 0-byte responses. Now detected and reported as errors.
Also skip transcript segments with fewer than 2 assistant messages
(rate-limited sessions, stub conversations).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add --debug flag that prints the full prompt and LLM response to
stdout, making it easy to iterate on agent prompts. Also adds
prompt field to AgentResult so callers can inspect what was sent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Raw agent responses were being stored as nodes in the graph
(_consolidate-*, _knowledge-*), creating thousands of nodes per day
that polluted search results and bloated the store. Now logged to
~/.claude/memory/llm-logs/<agent>/<timestamp>.txt instead.
Node creation should only happen through explicit agent actions
(WRITE_NODE, REFINE) or direct poc-memory write tool calls.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New command: `poc-memory agent run <agent> [--count N] [--dry-run]`
Runs a single agent by name through the full pipeline (build prompt,
call LLM, apply actions). With --dry-run, sets POC_MEMORY_DRY_RUN=1
so all mutations are no-ops but the agent can still read the graph.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All mutating commands (write, delete, rename, link-add, journal write,
used, wrong, not-useful, gap) check POC_MEMORY_DRY_RUN after argument
validation but before mutation. If set, process exits silently — agent
tool calls are visible in the LLM output so we can see what it tried
to do without applying changes.
Read commands (render, search, graph link, journal tail) work normally
in dry-run mode so agents can still explore the graph.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update the observation agent prompt to:
- Check the journal around transcript timestamps before extracting
- Link extractions back to relevant journal entries
- Use poc-memory tools directly (search, render, write, link-add)
- Prefer REFINE over WRITE_NODE
- Simplified and focused prompt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire select_conversation_fragments to use store.is_segment_mined()
instead of scanning _observed-transcripts stub nodes. Segments are
now marked AFTER the agent succeeds (via mark_observation_done),
not before — so failed runs don't lose segments.
Fragment IDs flow through the Resolved.keys → AgentBatch.node_keys
path so run_and_apply_with_log can mark them post-success.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add TranscriptSegment capnp schema and append-only log for tracking
which transcript segments have been mined by which agents. Replaces
the old approach of creating stub nodes (_observed-transcripts,
_mined-transcripts, _facts-) in the main graph store.
- New schema: TranscriptSegment and TranscriptProgressLog
- Store methods: append_transcript_progress, replay, is_segment_mined,
mark_segment_mined
- Migration command: admin migrate-transcript-progress (migrated 1771
markers, soft-deleted old stub nodes)
- Progress log replayed on all Store::load paths
Also: revert extractor.agent to graph-only (no CONVERSATIONS),
update memory-instructions-core with refine-over-create principle.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extractor is a graph neighborhood organizer, not a transcript miner.
Remove {{CONVERSATIONS}} that was incorrectly merged in. Keep the
new includes (core-personality, memory-instructions-core) and tools.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add distill_count to ConsolidationPlan, daemon health metrics,
and TUI display. Distill agent now participates in the
consolidation budget alongside replay, linker, separator,
transfer, organize, and connector.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Was 130k, calibrated for the old 200k window. With the 1M token
context window, this was firing false compaction warnings for the
entire session.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All 17 agents now include {{node:core-personality}} and
{{node:memory-instructions-core}} instead of duplicating tool
blocks and graph walk instructions in each file. Stripped
duplicated tool/navigation sections from linker, organize,
distill, and evaluate. All agents now have Bash(poc-memory:*)
tool access for graph walking.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add {{node:KEY}} placeholder resolver — agents can inline any graph
node's content in their prompts. Used for shared instructions.
- Remove hardcoded identity preamble from defs.rs — agents now pull
identity and instructions from the graph via {{node:core-personality}}
and {{node:memory-instructions-core}}.
- Agent output report keys now include a content slug extracted from
the first line of LLM output, making them human-readable
(e.g. _consolidate-distill-20260316T014739-distillation-run-complete).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Strip context bloat from nudge messages — no more IRC digest, git
log, or work state inlined into tmux send-keys (was silently dropping
the entire message). Nudge now just includes pending notification count.
- Notifications no longer send directly via tmux — they flow through
the idle nudge only. Urgent notifications reset the fired flag so
the nudge fires sooner.
- Add test-nudge RPC that exercises the actual daemon send path
(test-send was client-side only, didn't test the real code path).
- Update nudge text: "Let your feelings guide your thinking."
- Increase send-keys sleep from 200ms to 500ms.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Agent identity injection: prepend core-personality to all agent prompts
so agents dream as me, not as generic graph workers. Include instructions
to walk the graph and connect new nodes to core concepts.
- Parallel agent scheduling: sequential within type, parallel across types.
Different agent types (linker, organize, replay) run concurrently.
- Linker prompt: graph walking instead of keyword search for connections.
"Explore the local topology and walk the graph until you find the best
connections."
- memory-search fixes: format_results no longer truncates to 5 results,
pipeline default raised to 50, returned file cleared on compaction,
--seen and --seen-full merged, compaction timestamp in --seen output,
max_entries=3 per prompt for steady memory drip.
- Stemmer optimization: strip_suffix now works in-place on a single String
buffer instead of allocating 18 new Strings per word. Note for future:
reversed-suffix trie for O(suffix_len) instead of O(n_rules).
- Transcript: add compaction_timestamp() for --seen display.
- Agent budget configurable (default 4000 from config).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The graph changes fast with 1000+ agents per cycle. Daily was too
slow for the feedback loop. 6-hour cycle means Elo evaluation and
agent reallocation happen 4x per day.
Runs on first tick after daemon start (initialized to past).
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
agent_budget config (default 1000) replaces health-metric-computed
totals. The budget is the total agent runs per cycle — use it all.
Elo distribution is squared for power-law unfairness: top-rated agents
get disproportionately more runs. If linker has Elo 1123 and connector
has 876, linker gets ~7x more runs (squared ratio) vs ~3.5x (linear).
Minimum 2 runs per type so underperformers still get evaluated.
No Elo file → equal distribution as fallback.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
After health metrics compute the total agent budget, read
agent-elo.json and redistribute proportionally to Elo ratings.
Higher-rated agent types get more runs.
Health determines HOW MUCH work. Elo determines WHAT KIND.
Every type gets at least 1 run. If no Elo file exists, falls back
to the existing hardcoded allocation.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
The LCG was producing only 2 distinct matchup pairs due to poor
constants. Switch to xorshift32 for proper coverage of all type pairs.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Replace sort-based ranking with proper Elo system:
- Each agent TYPE has a persistent Elo rating (agent-elo.json)
- Each matchup: pick two random types, grab a recent action from
each, LLM compares, update ratings
- Ratings persist across daily evaluations — natural recency bias
from continuous updates against current opponents
- K=32 for fast adaptation to prompt changes
Usage: poc-memory agent evaluate --matchups 30 --model haiku
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
TIE causes inconsistency in sort (A=B, B=C but A>C breaks ordering).
Force the comparator to always pick a winner. Default to A if response
is unparseable.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
- Use CARGO_MANIFEST_DIR for agent file path (same as defs.rs)
- Dedup affected nodes extracted from reports
- --dry-run shows example comparison prompt without LLM calls
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Chain-of-thought: "say which is better and why" forces clearer
judgment and gives us analysis data for improving agents.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
When both actions are from the same agent, show the instructions once
and just compare the two report outputs + affected nodes. Saves tokens
and makes the comparison cleaner.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>