agent: bundle sampling fields as SamplingParams on AgentState

Collapse the split `temperature` / `top_p` / `top_k` fields on
AgentState into a single `sampling: SamplingParams` struct, mirroring
how the wire-level fields flow into the Generate RPC. Adds
`max_tokens` to SamplingParams so it's actually plumbed end to end
(previously the client had a hardcoded 4096 fallback inside
`run_session_generate`).

AgentState construction sites now set `sampling: SamplingParams { ...
max_tokens: 4096 }` as the default. The assignment sites in
oneshot.rs / subconscious.rs / unconscious.rs switch from
`st.temperature = X` to `st.sampling.temperature = X`.

`stream_session_mm` takes `SamplingParams` directly; the
`sampling_max_tokens()` helper goes away. `pb::GenerateRequest` is
populated with `sampling.max_tokens` (and the other fields) in
`run_session_generate`. SamplingParams is `pub` so it can be
embedded in the public AgentState without a visibility warning.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
This commit is contained in:
Kent Overstreet 2026-04-24 12:37:20 -04:00
commit be6ba4e9a5
6 changed files with 29 additions and 32 deletions

View file

@ -631,7 +631,7 @@ impl Subconscious {
{
let mut st = forked.state.lock().await;
st.provenance = auto.name.clone();
st.temperature = auto.temperature;
st.sampling.temperature = auto.temperature;
// Surface agent gets near-interactive priority;
// other subconscious agents get lower priority.
st.priority = Some(if auto.name == "surface" { 1 } else { auto.priority });