Add direct API backend for agent execution

When api_base_url is configured, agents call the LLM directly via OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling out to claude CLI. Implements the full tool loop: send prompt, if tool_calls execute them and send results back, repeat until text. This enables running agents against local/remote models like Qwen-27B on a RunPod B200, with no dependency on claude CLI. Config fields: api_base_url, api_key, api_model. Falls back to claude CLI when api_base_url is not set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:05:14 -04:00 · 2026-03-18 23:05:14 -04:00 · a29b6d4c5d
commit a29b6d4c5d
parent 1b48e57f34
6 changed files with 145 additions and 1 deletions
--- a/poc-memory/Cargo.toml
+++ b/poc-memory/Cargo.toml
@ -22,6 +22,7 @@ paste = "1"
 jobkit = { git = "https://evilpiepirate.org/git/jobkit.git/" }
 jobkit-daemon = { git = "https://evilpiepirate.org/git/jobkit-daemon.git/" }
 poc-agent = { path = "../poc-agent" }
+tokio = { version = "1", features = ["rt-multi-thread"] }
 redb = "2"
 log = "0.4"
 ratatui = "0.29"