Add cloud API support and per-agent model override

Cloud API support:
- Add chat_api config flag to BackendConfig, threaded through SessionConfig → ResolvedModel → Agent → Mind
- New StreamToken::TextDelta variant for chat completions streaming
- stream_chat_completion() method on ApiClient: builds messages array, sends to /v1/chat/completions, parses SSE stream
- ChatMessage struct and wire_messages() on ContextState: converts the AST (system/identity/journal/conversation nodes) into a messages array for the chat API, handling images as base64 data URIs
- ResponseParser handles TextDelta alongside Token variants
- TUI rendering fix: tokens() returns byte-length estimate (~4 bytes/token) when tokenizer isn't loaded, so the change detector actually triggers re-renders
- Gate all vLLM-specific scoring (memory scoring, finetune scoring, compare scoring) behind !chat_api checks

Per-agent model override:
- Add model field to agent definition headers (.agent files)
- Thread through AutoAgent → prepare_spawn → resolve_model
- Agents fall back to default_backend when model is unset
- Enables cheaper backends (e.g. Kimi) for graph maintenance agents while keeping Sonnet for conversation

Tested: end-to-end with Poe API + Haiku, chat_api: true in config. TUI starts, messages send, responses stream and render.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 15:39:13 -04:00
commit 6c26cee86e
parent 37087ac6d9
10 changed files with 353 additions and 28 deletions
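As a reading aid for the changes in this commit, here is a minimal, standalone sketch of how the `chat_api` flag might flow from `BackendConfig` into `ResolvedModel` and then drive endpoint selection, scoring gating, and the per-agent backend fallback. The struct fields are trimmed and serde attributes omitted; `resolve`, `endpoint_path`, `scoring_enabled`, and `pick_backend` are hypothetical helper names used only for illustration and are not taken from the repository.

```rust
/// Trimmed stand-in for BackendConfig in src/config.rs (serde attributes omitted).
#[derive(Debug, Clone, Default)]
pub struct BackendConfig {
    pub model_id: String,
    pub context_window: Option<usize>,
    /// Defaults to false, mirroring `#[serde(default)]` on the real field.
    pub chat_api: bool,
}

/// Trimmed stand-in for ResolvedModel.
#[derive(Debug, Clone)]
pub struct ResolvedModel {
    pub model_id: String,
    pub context_window: Option<usize>,
    pub chat_api: bool,
}

/// Hypothetical helper: copy the flag from the backend into the resolved model.
pub fn resolve(b: &BackendConfig) -> ResolvedModel {
    ResolvedModel {
        model_id: b.model_id.clone(),
        context_window: b.context_window,
        chat_api: b.chat_api,
    }
}

/// Hypothetical helper: pick the request path based on the flag.
pub fn endpoint_path(m: &ResolvedModel) -> &'static str {
    if m.chat_api {
        "/v1/chat/completions"
    } else {
        "/v1/completions"
    }
}

/// Hypothetical helper: vLLM-specific scoring only runs without the chat API.
pub fn scoring_enabled(m: &ResolvedModel) -> bool {
    !m.chat_api
}

/// Hypothetical helper: an agent's `model` header wins; otherwise use the default backend.
pub fn pick_backend<'a>(agent_model: Option<&'a str>, default_backend: &'a str) -> &'a str {
    agent_model.unwrap_or(default_backend)
}
```

Under these assumptions, a backend with `chat_api: true` resolves to the chat completions path with scoring disabled, and an agent without a `model` header resolves to whatever `default_backend` names.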
--- a/src/config.rs
+++ b/src/config.rs
@@ -288,6 +288,11 @@ pub struct BackendConfig {
    /// Context window size in tokens.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub context_window: Option<usize>,
+    /// Use chat completions API (/v1/chat/completions) instead of
+    /// raw completions (/v1/completions). Required for cloud API
+    /// providers (OpenRouter, Anthropic, etc).
+    #[serde(default)]
+    pub chat_api: bool,
 }

 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -370,6 +375,8 @@ pub struct SessionConfig {
    pub app: AppConfig,
    /// Disable background agents (surface, observe, scoring)
    pub no_agents: bool,
+    /// Use chat completions API instead of raw completions.
+    pub chat_api: bool,
 }

 /// A fully resolved model ready to construct an ApiClient.
@@ -380,6 +387,7 @@ pub struct ResolvedModel {
    pub api_key: String,
    pub model_id: String,
    pub context_window: Option<usize>,
+    pub chat_api: bool,
 }

 impl AppConfig {
@@ -415,6 +423,7 @@ impl AppConfig {
            session_dir,
            app: self.clone(),
            no_agents: cli.no_agents,
+            chat_api: resolved.chat_api,
        })
    }

@@ -439,6 +448,7 @@ impl AppConfig {
            api_key: b.api_key.clone(),
            model_id: b.model_id.clone(),
            context_window: b.context_window,
+            chat_api: b.chat_api,
        })
    }