When api_base_url is configured, agents call the LLM directly via
OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling
out to claude CLI. Implements the full tool loop: send prompt, if
tool_calls execute them and send results back, repeat until text.
This enables running agents against local/remote models like
Qwen-27B on a RunPod B200, with no dependency on claude CLI.
Config fields: api_base_url, api_key, api_model.
Falls back to claude CLI when api_base_url is not set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>