poc-memory now reads from poc-agent's config.json5 as the primary
config source. Memory-specific settings live in a "memory" section;
API credentials are resolved from the shared model/backend config
instead of being duplicated.
- Add "memory" section to ~/.config/poc-agent/config.json5
- poc-memory config.rs: try shared config first, fall back to
legacy JSONL
- API fields (base_url, api_key, model) resolved via
memory.agent_model -> models -> backend lookup
- Add json5 dependency for proper JSON5 parsing
- Update provisioning scripts: hermes -> qwen3_coder tool parser
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ROCm-specific setup with:
- AITER attention backends (VLLM_ROCM_USE_AITER=1)
- Reduced cudagraph capture size (DeltaNet cache conflict)
- BF16 model + FP8 KV cache as default (FP8 weights can be
slower on MI300X due to ROCm kernel maturity)
- FP8=1 flag for benchmarking FP8 model weights
Key for training plan: if FP8 matmuls are slow on MI300X,
the quantize-and-expand strategy needs B200 instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>