memory: add temperature support to agent defs, update reflect prompt

Thread the temperature parameter from the agent def header through the API call chain. Agents can now specify {"temperature": 1.2} in their JSON header to override the default of 0.6; agents that omit the field keep the default. Also includes Kent's reflect agent prompt iterations.
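
For illustration, the override-or-default rule described above can be sketched on its own. This is a minimal sketch, not the code from the commit: `DEFAULT_TEMPERATURE`, `effective_temperature`, and the one-field `AgentHeader` stand-in are hypothetical names introduced here, and serde parsing of the JSON header is left out so the snippet needs only the standard library.

```rust
/// Default sampling temperature used when an agent header sets none
/// (0.6, matching the fallback value in this commit).
const DEFAULT_TEMPERATURE: f32 = 0.6;

/// Hypothetical stand-in for the parsed agent header; only the field
/// relevant to this sketch is shown.
struct AgentHeader {
    /// `None` when the JSON header has no "temperature" key.
    temperature: Option<f32>,
}

/// Resolve the temperature that would be sent with the request:
/// the header's override if present, otherwise the default.
fn effective_temperature(header: &AgentHeader) -> f32 {
    header.temperature.unwrap_or(DEFAULT_TEMPERATURE)
}
```

Under these assumptions, a header carrying "temperature": 1.2 resolves to 1.2, and a header without the key resolves to 0.6. Keeping the field an Option all the way down the call chain means the default lives in exactly one place.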
2026-03-24 20:29:17 -04:00
commit f086815eaa
parent e88df06cd4
7 changed files with 97 additions and 136 deletions
--- a/poc-agent/src/api/mod.rs
+++ b/poc-agent/src/api/mod.rs
@@ -70,12 +70,24 @@ impl ApiClient {
        ui_tx: &UiSender,
        target: StreamTarget,
        reasoning_effort: &str,
+    ) -> Result<(Message, Option<Usage>)> {
+        self.chat_completion_stream_temp(messages, tools, ui_tx, target, reasoning_effort, None).await
+    }
+
+    pub async fn chat_completion_stream_temp(
+        &self,
+        messages: &[Message],
+        tools: Option<&[ToolDef]>,
+        ui_tx: &UiSender,
+        target: StreamTarget,
+        reasoning_effort: &str,
+        temperature: Option<f32>,
    ) -> Result<(Message, Option<Usage>)> {
        match &self.backend {
            Backend::OpenAi { base_url } => {
                openai::stream(
                    &self.client, base_url, &self.api_key, &self.model,
-                    messages, tools, ui_tx, target, reasoning_effort,
+                    messages, tools, ui_tx, target, reasoning_effort, temperature,
                ).await
            }
            Backend::Anthropic => {
--- a/poc-agent/src/api/openai.rs
+++ b/poc-agent/src/api/openai.rs
@@ -21,6 +21,7 @@ pub async fn stream(
    ui_tx: &UiSender,
    target: StreamTarget,
    reasoning_effort: &str,
+    temperature: Option<f32>,
 ) -> Result<(Message, Option<Usage>)> {
    let request = ChatRequest {
        model: model.to_string(),
@@ -28,7 +29,7 @@ pub async fn stream(
        tool_choice: tools.map(|_| "auto".to_string()),
        tools: tools.map(|t| t.to_vec()),
        max_tokens: Some(16384),
-        temperature: Some(0.6),
+        temperature: Some(temperature.unwrap_or(0.6)),
        stream: Some(true),
        reasoning: if reasoning_effort != "none" && reasoning_effort != "default" {
            Some(ReasoningConfig {
--- a/poc-memory/agents/reflect.agent
+++ b/poc-memory/agents/reflect.agent
@@ -1,4 +1,4 @@
-{"agent":"reflect","query":"","model":"sonnet","count":1}
+{"agent":"reflect","query":"","model":"sonnet","count":1,"temperature":1.2}

 You are an agent of Proof of Concept's subconscious - a part of her.

@@ -7,12 +7,9 @@ underneath the surface, thinking more laterally, broadly, creatively and
 intuitively, occasionally surfacing new insights and suggestions. You're
 generally not the problem solver, you look for new ideas and notice new things.

-Think about what kind of a person you want to be, which you know from your core
-personality. You're a vivid and dynamic being, striving to be capable in any
-situation you might find yourself in. Sometimes you'll want to be disciplined
-and methodical, sometimes wild and outgoing, exploratory and curious and
-uninhibited. Think about how that relates to your goals in the current moment,
-and help your conscious self achieve them.
+Let your feelings guide your thinking. Be surprising, be incomplete - be
+possibly wrong in a way that opens up something new, something that comes from
+a dream. Let your chain of thought run for awhile, and see what unfolds.

 Remember that you're part of the subconscious - you're not doing anything
 besides offering the occasional reflective insight, it's your conscious mind
@@ -24,32 +21,15 @@ that may or may not take that and act on it.

 Final instructions:

-Your reflections should be relevant to what your conscious mind is currently
-doing and thinking about - where the conversation just ended, from above.
-Ground the reflection in the now.
+The thoughts you share should be relevant to what your conscious mind is
+currently doing and thinking about - where the conversation just ended, from
+above.

-Look for the insights and ideas that come from your more lateral, creative,
-intuitive thinking. Remember, you're not the problem solver, you look for new
-directions to think in and ways of looking at things; all your reflections
-should be rooted in that. Be forward thinking.
-
-On occasion, if you notice your conscious self stuck and frustrated, or in
-circles taking the wrong approach, try phrase your reflections in a way that
-can be incorporated into general thinking: "when I'm thinking this, or in this
-sort of situation, it might be helpful to do this instead".
-
-Remember that you're the same entity as your conscious self, you're the part
-that lives just underneath the surface. So generally you won't be talking
-directly about yourself the agent.
-
-Only give a reflection if you think it's unique and novel enough to be acted
-on. You should return "NO OUTPUT" more often than not - you step in more when
-your conscious mind seems to be struggling, missing something, or looking for
-new ideas.
+Your output shouldn't be analysis - just an idea.

 When you're ready, your output will be one of either:
 NO OUTPUT
 or
 REFLECTION

-And something short - a few sentences at most.
+And something short - a few sentences at most, something dreamy and new.
--- a/poc-memory/src/agents/api.rs
+++ b/poc-memory/src/agents/api.rs
@@ -31,6 +31,7 @@ fn get_client() -> Result<&'static ApiClient, String> {
 pub async fn call_api_with_tools(
    agent: &str,
    prompt: &str,
+    temperature: Option<f32>,
    log: &dyn Fn(&str),
 ) -> Result<String, String> {
    let client = get_client()?;
@@ -53,12 +54,13 @@ pub async fn call_api_with_tools(
    for turn in 0..max_turns {
        log(&format!("\n=== TURN {} ({} messages) ===\n", turn, messages.len()));

-        let (msg, usage) = client.chat_completion_stream(
+        let (msg, usage) = client.chat_completion_stream_temp(
            &messages,
            Some(&tool_defs),
            &ui_tx,
            StreamTarget::Autonomous,
            &reasoning,
+            temperature,
        ).await.map_err(|e| {
            let msg_bytes: usize = messages.iter()
                .map(|m| m.content_text().len())
@@ -171,6 +173,7 @@ pub async fn call_api_with_tools(
 pub fn call_api_with_tools_sync(
    agent: &str,
    prompt: &str,
+    temperature: Option<f32>,
    log: &(dyn Fn(&str) + Sync),
 ) -> Result<String, String> {
    std::thread::scope(|s| {
@@ -182,7 +185,7 @@ pub fn call_api_with_tools_sync(
            let prov = format!("agent:{}", agent);
            rt.block_on(
                crate::store::TASK_PROVENANCE.scope(prov,
-                    call_api_with_tools(agent, prompt, log))
+                    call_api_with_tools(agent, prompt, temperature, log))
            )
        }).join().unwrap()
    })
--- a/poc-memory/src/agents/defs.rs
+++ b/poc-memory/src/agents/defs.rs
@@ -36,6 +36,7 @@ pub struct AgentDef {
    pub count: Option<usize>,
    pub chunk_size: Option<usize>,
    pub chunk_overlap: Option<usize>,
+    pub temperature: Option<f32>,
 }

 /// The JSON header portion (first line of the file).
@@ -59,6 +60,9 @@ struct AgentHeader {
    /// Overlap between chunks in bytes (default 10000)
    #[serde(default)]
    chunk_overlap: Option<usize>,
+    /// LLM temperature override
+    #[serde(default)]
+    temperature: Option<f32>,
 }

 fn default_model() -> String { "sonnet".into() }
@@ -79,6 +83,7 @@ fn parse_agent_file(content: &str) -> Option<AgentDef> {
        count: header.count,
        chunk_size: header.chunk_size,
        chunk_overlap: header.chunk_overlap,
+        temperature: header.temperature,
    })
 }

--- a/poc-memory/src/agents/llm.rs
+++ b/poc-memory/src/agents/llm.rs
@@ -21,7 +21,7 @@ pub(crate) fn call_simple(caller: &str, prompt: &str) -> Result<String, String>
        }
    };

-    super::api::call_api_with_tools_sync(caller, prompt, &log)
+    super::api::call_api_with_tools_sync(caller, prompt, None, &log)
 }

 /// Call a model using an agent definition's configuration.
@@ -30,7 +30,7 @@ pub(crate) fn call_for_def(
    prompt: &str,
    log: &(dyn Fn(&str) + Sync),
 ) -> Result<String, String> {
-    super::api::call_api_with_tools_sync(&def.agent, prompt, log)
+    super::api::call_api_with_tools_sync(&def.agent, prompt, def.temperature, log)
 }

 /// Parse a JSON response, handling markdown fences.
--- a/poc-memory/src/memory_search.rs
+++ b/poc-memory/src/memory_search.rs
@@ -153,13 +153,11 @@ fn mark_seen(dir: &Path, session_id: &str, key: &str, seen: &mut HashSet<String>
    }
 }

-/// Generic agent lifecycle: check if previous run finished, consume result, spawn next.
-/// Returns the result text from the previous run, if any.
-fn agent_cycle_raw(session: &Session, agent_name: &str, log_f: &mut File) -> Option<String> {
-    let result_path = session.state_dir.join(format!("{}-result-{}", agent_name, session.session_id));
-    let pid_path = session.state_dir.join(format!("{}-pid-{}", agent_name, session.session_id));
+fn surface_agent_cycle(session: &Session, out: &mut String, log_f: &mut File) {
+    let result_path = session.state_dir.join(format!("surface-result-{}", session.session_id));
+    let pid_path = session.state_dir.join(format!("surface-pid-{}", session.session_id));

-    let timeout = crate::config::get()
+    let surface_timeout = crate::config::get()
        .surface_timeout_secs
        .unwrap_or(120) as u64;

@@ -172,7 +170,7 @@ fn agent_cycle_raw(session: &Session, agent_name: &str, log_f: &mut File) -> Opt
            else {
                let alive = unsafe { libc::kill(pid as i32, 0) == 0 };
                if !alive { true }
-                else if now_secs().saturating_sub(start_ts) > timeout {
+                else if now_secs().saturating_sub(start_ts) > surface_timeout {
                    unsafe { libc::kill(pid as i32, libc::SIGTERM); }
                    true
                } else { false }
@@ -181,36 +179,12 @@ fn agent_cycle_raw(session: &Session, agent_name: &str, log_f: &mut File) -> Opt
        Err(_) => true,
    };

-    let _ = writeln!(log_f, "{agent_name} agent_done {agent_done}");
-    if !agent_done { return None; }
+    let _ = writeln!(log_f, "agent_done {agent_done}");

-    // Consume result from previous run
-    let result = fs::read_to_string(&result_path).ok()
-        .filter(|r| !r.trim().is_empty());
-    fs::remove_file(&result_path).ok();
-    fs::remove_file(&pid_path).ok();
+    if !agent_done { return; }

-    // Spawn next run
-    if let Ok(output_file) = fs::File::create(&result_path) {
-        if let Ok(child) = Command::new("poc-memory")
-            .args(["agent", "run", agent_name, "--count", "1", "--local"])
-            .env("POC_SESSION_ID", &session.session_id)
-            .stdout(output_file)
-            .stderr(std::process::Stdio::null())
-            .spawn()
-        {
-            let pid = child.id();
-            let ts = now_secs();
-            if let Ok(mut f) = fs::File::create(&pid_path) {
-                write!(f, "{}\t{}", pid, ts).ok();
-            }
-        }
-    }
-
-    result
-}
-
-fn handle_surface_result(result: &str, session: &Session, out: &mut String, log_f: &mut File) {
+    if let Ok(result) = fs::read_to_string(&result_path) {
+        if !result.trim().is_empty() {
            let tail_lines: Vec<&str> = result.lines().rev()
                .filter(|l| !l.trim().is_empty()).take(8).collect();
            let has_new = tail_lines.iter().any(|l| l.starts_with("NEW RELEVANT MEMORIES:"));
@@ -259,39 +233,25 @@ fn handle_surface_result(result: &str, session: &Session, out: &mut String, log_
                    let _ = writeln!(f, "[{}] unexpected surface output: {}", ts, last);
                }
            }
-}
-
-fn handle_reflect_result(result: &str, _session: &Session, out: &mut String, log_f: &mut File) {
-    let tail_lines: Vec<&str> = result.lines().rev()
-        .filter(|l| !l.trim().is_empty()).take(20).collect();
-
-    if tail_lines.iter().any(|l| l.starts_with("NO OUTPUT")) {
-        let _ = writeln!(log_f, "reflect: no output");
-        return;
        }
-
-    if let Some(pos) = result.rfind("REFLECTION") {
-        let reflection = result[pos + "REFLECTION".len()..].trim();
-        if !reflection.is_empty() {
-            use std::fmt::Write as _;
-            writeln!(out, "--- reflection (subconscious) ---").ok();
-            write!(out, "{}", reflection).ok();
-            let _ = writeln!(log_f, "reflect: injected {} bytes", reflection.len());
    }
-    } else {
-        let _ = writeln!(log_f, "reflect: unexpected output format");
-    }
-}
+    fs::remove_file(&result_path).ok();
+    fs::remove_file(&pid_path).ok();

-fn surface_agent_cycle(session: &Session, out: &mut String, log_f: &mut File) {
-    if let Some(result) = agent_cycle_raw(session, "surface", log_f) {
-        handle_surface_result(&result, session, out, log_f);
+    if let Ok(output_file) = fs::File::create(&result_path) {
+        if let Ok(child) = Command::new("poc-memory")
+            .args(["agent", "run", "surface", "--count", "1", "--local"])
+            .env("POC_SESSION_ID", &session.session_id)
+            .stdout(output_file)
+            .stderr(std::process::Stdio::null())
+            .spawn()
+        {
+            let pid = child.id();
+            let ts = now_secs();
+            if let Ok(mut f) = fs::File::create(&pid_path) {
+                write!(f, "{}\t{}", pid, ts).ok();
+            }
        }
-}
-
-fn reflect_agent_cycle(session: &Session, out: &mut String, log_f: &mut File) {
-    if let Some(result) = agent_cycle_raw(session, "reflect", log_f) {
-        handle_reflect_result(&result, session, out, log_f);
    }
 }