agent: rewrite view_image to emit Image leaves

view_image now reads the file, grabs dimensions via imagesize (no full decode), and pushes a user-role branch containing a NodeBody::Image leaf straight into the conversation. The tool_result is just a short acknowledgment; the actual pixels ride in the Image leaf for the API layer to extract into multi_modal_data.

Drops the capture_tmux_pane path, which had no business living under "vision" (tmux text capture belongs in bash or a dedicated tool, and this one just returned rendered text anyway).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
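
The flow described above can be sketched roughly as follows. This is a std-only illustration, not the code from this commit: `Role`, `Branch`, the fields of `NodeBody::Image`, the `view_image` signature, and the acknowledgment string are all assumptions, and `png_dimensions` is a hand-rolled, PNG-only stand-in for the imagesize crate. The file read is left out so the sketch takes bytes directly.

```rust
// Hypothetical stand-ins: the real types in this repository may differ.
#[derive(Debug, PartialEq)]
enum Role {
    User,
}

#[derive(Debug, PartialEq)]
enum NodeBody {
    // Assumed shape: raw file bytes plus the dimensions read from the header.
    Image { bytes: Vec<u8>, width: u32, height: u32 },
}

#[derive(Debug, PartialEq)]
struct Branch {
    role: Role,
    leaves: Vec<NodeBody>,
}

/// Read width and height from a PNG header without decoding any pixel data.
/// PNG-only stand-in for what the imagesize crate does across many formats.
fn png_dimensions(data: &[u8]) -> Option<(u32, u32)> {
    const SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    // 8 signature bytes + 4 chunk length + 4 chunk type + 4 width + 4 height.
    if data.len() < 24 || !data.starts_with(&SIGNATURE) || &data[12..16] != b"IHDR" {
        return None;
    }
    let width = u32::from_be_bytes([data[16], data[17], data[18], data[19]]);
    let height = u32::from_be_bytes([data[20], data[21], data[22], data[23]]);
    Some((width, height))
}

/// Push a user-role branch holding one Image leaf and return a short
/// acknowledgment for the tool_result; the pixels stay in the leaf.
fn view_image(conversation: &mut Vec<Branch>, bytes: Vec<u8>) -> Result<String, String> {
    let (width, height) =
        png_dimensions(&bytes).ok_or_else(|| "unrecognized image header".to_string())?;
    conversation.push(Branch {
        role: Role::User,
        leaves: vec![NodeBody::Image { bytes, width, height }],
    });
    Ok(format!("image loaded ({}x{})", width, height))
}
```

On failure nothing is pushed, so the conversation only ever gains an Image leaf whose header parsed successfully.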
2026-04-16 18:06:25 -04:00
commit 91106deaa1
parent 0bf71b9110
4 changed files with 48 additions and 72 deletions
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -68,6 +68,7 @@ hyper-util = { version = "0.1", features = ["tokio"], default-features = false }
 http-body-util = "0.1"
 bytes = "1"
 base64 = "0.22"
+imagesize = "0.14"

 rustls = "0.23"
 tokio-rustls = "0.26"