poc-memory now reads from poc-agent's config.json5 as the primary
config source. Memory-specific settings live in a "memory" section;
API credentials are resolved from the shared model/backend config
instead of being duplicated.
- Add "memory" section to ~/.config/poc-agent/config.json5
- poc-memory config.rs: try shared config first, fall back to
legacy JSONL
- API fields (base_url, api_key, model) resolved via
memory.agent_model -> models -> backend lookup (see the sketch
after this list)
- Add json5 dependency for proper JSON5 parsing
- Update provisioning scripts: hermes -> qwen3_coder tool parser
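A minimal sketch of that lookup. The type names and the backends
map are assumptions; only base_url, api_key, model, and
memory.agent_model come from the actual config:

    use std::collections::HashMap;

    use serde::Deserialize;

    #[derive(Deserialize)]
    struct SharedConfig {
        memory: MemoryConfig,
        models: HashMap<String, ModelEntry>,
        backends: HashMap<String, Backend>,
    }

    #[derive(Deserialize)]
    struct MemoryConfig {
        agent_model: String,
    }

    #[derive(Deserialize)]
    struct ModelEntry {
        backend: String,
        model: String,
    }

    #[derive(Deserialize)]
    struct Backend {
        base_url: String,
        api_key: String,
    }

    /// memory.agent_model -> models -> backend, yielding the API fields.
    fn resolve_api(raw: &str) -> Option<(String, String, String)> {
        let cfg: SharedConfig = json5::from_str(raw).ok()?;
        let entry = cfg.models.get(&cfg.memory.agent_model)?;
        let backend = cfg.backends.get(&entry.backend)?;
        Some((
            backend.base_url.clone(),
            backend.api_key.clone(),
            entry.model.clone(),
        ))
    }

If parsing or any step of the lookup fails, the legacy fallback
still applies.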
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ROCm-specific setup with:
- AITER attention backends (VLLM_ROCM_USE_AITER=1)
- Reduced cudagraph capture size (DeltaNet cache conflict)
- BF16 model + FP8 KV cache as default (FP8 weights can be
slower on MI300X due to ROCm kernel maturity)
- FP8=1 flag for benchmarking FP8 model weights
Key for the training plan: if FP8 matmuls turn out to be slow on
MI300X, the quantize-and-expand strategy needs B200 instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cursor index points into self.input, but the rendered buffer
contains the prompt prepended to the first line, so prompt.len()
must be added to get the correct character position when scanning
the buffer.
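A sketch of the offset with hypothetical field names. Since
prompt.len() counts bytes, the sketch uses the character count,
which matches it for ASCII prompts:

    struct Input {
        prompt: String, // rendered before the first line of user text
        input: String,  // the text the cursor indexes into
        cursor: usize,  // char position within `input`
    }

    impl Input {
        /// Cursor position within the rendered buffer, which holds
        /// `prompt` followed by `input`, so the prompt's width must
        /// be added before scanning the buffer.
        fn buffer_cursor(&self) -> usize {
            self.prompt.chars().count() + self.cursor
        }
    }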
- Always display reasoning tokens regardless of reasoning_effort
setting — Qwen 3.5 thinks natively and the reasoning parser
separates it into its own field
- Remove chat_template_kwargs that disabled thinking when
reasoning_effort was "none"
- Add chat_template_kwargs field to ChatRequest for vllm compat
(see the sketch after this list)
- Update provision script: qwen3_xml tool parser, qwen3 reasoning
parser, 262K context, 95% GPU memory utilization
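A sketch of the request shape; every field except
chat_template_kwargs is illustrative:

    use serde::Serialize;

    #[derive(Serialize)]
    struct ChatRequest {
        model: String,
        messages: Vec<serde_json::Value>, // simplified for the sketch
        /// Forwarded verbatim to vllm's chat template. Skipped when
        /// unset so the request stays compatible with servers that
        /// reject unknown fields.
        #[serde(skip_serializing_if = "Option::is_none")]
        chat_template_kwargs: Option<serde_json::Value>,
    }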
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled,
and the Hermes tool call parser for function calling support.
Configurable via environment variables (MODEL, PORT, MAX_MODEL_LEN).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All agent output now goes to the store as nodes instead of
markdown/JSON files. Each node carries a Provenance enum identifying
which agent created it (AgentDigest, AgentConsolidate, AgentFactMine,
AgentKnowledgeObservation, etc.; 14 variants total).
Store changes:
- upsert_provenance() method for agent-created nodes
- Provenance enum expanded from 5 to 14 variants (sketch below)
Agent changes:
- digest: writes to store nodes (daily-YYYY-MM-DD.md, etc.)
- consolidate: reports/actions/logs stored as _consolidation-* nodes
- knowledge: depth DB and agent output stored as _knowledge-* nodes
- enrich: experience-mine results go directly to store
- llm: --no-session-persistence prevents transcript accumulation
Deleted: 14 Python/shell scripts replaced by Rust implementations.
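A hedged sketch of the store side; the Node and Store shapes are
assumptions, and the variants shown are the four named above:

    use std::collections::HashMap;

    #[derive(Clone, Debug)]
    enum Provenance {
        AgentDigest,
        AgentConsolidate,
        AgentFactMine,
        AgentKnowledgeObservation,
        // ...ten more variants
    }

    #[derive(Default)]
    struct Node {
        body: String,
        provenance: Option<Provenance>,
    }

    struct Store {
        nodes: HashMap<String, Node>,
    }

    impl Store {
        /// Insert or update an agent-created node, recording which
        /// agent produced it.
        fn upsert_provenance(&mut self, key: &str, body: String, prov: Provenance) {
            let node = self.nodes.entry(key.to_string()).or_default();
            node.body = body;
            node.provenance = Some(prov);
        }
    }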
Four layer-2 agents produce new knowledge from the memory graph:
they mine conversations, extract patterns from clusters, find
cross-domain connections, and stress-test existing nodes. Output
goes to agent-results/. knowledge_loop.py runs them on a schedule
with quality tracking.
- New spectral module: Laplacian eigendecomposition of the memory graph.
Commands: spectral, spectral-save, spectral-neighbors, spectral-positions,
spectral-suggest. Spectral neighbors expand search results beyond keyword
matching to structural proximity (see the embedding sketch after this list).
- Search: use StoreView trait to avoid 6MB state.bin rewrite on every query.
Append-only retrieval logging. Spectral expansion shows structurally
nearby nodes after text results.
- Fix panic in journal-tail: string truncation at byte 67 could land inside
a multi-byte character (em dash). Now walks back to a char boundary
(sketch after this list).
- Replay queue: show classification and spectral outlier score.
- Knowledge agents: extractor, challenger, connector prompts and runner
scripts for automated graph enrichment.
- memory-search hook: stale state file cleanup (24h expiry).
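A minimal sketch of the embedding behind spectral-positions and
spectral-neighbors, assuming nalgebra for the eigendecomposition
(the crate isn't named here):

    use nalgebra::DMatrix;

    /// Embed nodes via the k eigenvectors of the graph Laplacian
    /// L = D - A with the smallest nonzero eigenvalues; Euclidean
    /// distance between rows then gives structural neighbors rather
    /// than keyword matches.
    fn spectral_positions(adj: &DMatrix<f64>, k: usize) -> Vec<Vec<f64>> {
        let n = adj.nrows();
        let mut lap = -adj.clone();
        for i in 0..n {
            lap[(i, i)] = adj.row(i).sum(); // degree on the diagonal
        }
        let eig = lap.symmetric_eigen();
        // nalgebra does not sort eigenpairs: order them ascending and
        // skip the trivial eigenvector at eigenvalue ~0.
        let mut order: Vec<usize> = (0..n).collect();
        order.sort_by(|&a, &b| {
            eig.eigenvalues[a].partial_cmp(&eig.eigenvalues[b]).unwrap()
        });
        (0..n)
            .map(|node| {
                order[1..(k + 1).min(n)]
                    .iter()
                    .map(|&j| eig.eigenvectors[(node, j)])
                    .collect()
            })
            .collect()
    }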
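And a sketch of the journal-tail fix; the helper name is
hypothetical:

    /// Truncate to at most `max_bytes`, walking back to a char
    /// boundary so a cut at e.g. byte 67 cannot split a multi-byte
    /// character such as an em dash (slicing mid-char panics).
    fn truncate_at_boundary(s: &str, max_bytes: usize) -> &str {
        if s.len() <= max_bytes {
            return s;
        }
        let mut end = max_bytes;
        while !s.is_char_boundary(end) {
            end -= 1; // byte 0 is always a boundary, so this terminates
        }
        &s[..end]
    }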
Add store_helpers.py with shared helpers that call poc-memory commands
(list-keys, render, journal-tail) instead of globbing ~/.claude/memory/*.md
and parsing section headers.
All 9 Python scripts updated: get_semantic_keys(), get_topic_file_index(),
get_recent_journal(), parse_journal_entries(), read_journal_range(),
collect_topic_stems(), and file preview rendering now go through the store.
This completes the clean switch — no script reads archived markdown files.
Switch the state file to a binary format: faster
serialization/deserialization, smaller on disk (4.2MB vs 5.9MB).
Automatic migration from state.json on first load: it reads the
JSON, writes state.bin, and deletes the old file.
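A sketch of the load path. Neither codec is named here, so
serde_json and bincode are assumptions, as are the type names:

    use serde::{Deserialize, Serialize};
    use std::{error::Error, fs, path::Path};

    #[derive(Serialize, Deserialize, Default)]
    struct State {
        // nodes, edges, ... elided
    }

    /// One-time migration: if only state.json exists, read it, write
    /// state.bin, delete the JSON, and return the loaded state.
    fn load_state(dir: &Path) -> Result<State, Box<dyn Error>> {
        let bin = dir.join("state.bin");
        let json = dir.join("state.json");
        if !bin.exists() && json.exists() {
            let state: State = serde_json::from_slice(&fs::read(&json)?)?;
            fs::write(&bin, bincode::serialize(&state)?)?;
            fs::remove_file(&json)?;
            return Ok(state);
        }
        Ok(bincode::deserialize(&fs::read(&bin)?)?)
    }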
Added list-keys, list-edges, dump-json commands so Python scripts no
longer need to parse the cache directly. Updated bulk-categorize.py
and consolidation-loop.py to use the new CLI commands.
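A sketch of the subcommand surface, assuming clap's derive API
(the argument parser isn't named here):

    use clap::{Parser, Subcommand};

    #[derive(Parser)]
    struct Cli {
        #[command(subcommand)]
        cmd: Cmd,
    }

    #[derive(Subcommand)]
    enum Cmd {
        /// Print every node key, one per line.
        ListKeys,
        /// Print the edge list.
        ListEdges,
        /// Dump the whole store as JSON so scripts never touch the
        /// cache format directly.
        DumpJson,
    }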