subconscious: lift continuation gen + render helpers into shared homes
- context.rs gains is_assistant, render_branch_text, render_prior_context
alongside memory_key / is_memory_node. They're pure AST helpers, used
by both the finetune pipeline and the forthcoming compare screen.
- new subconscious/generate.rs holds gen_continuation(context, entry_idx,
skip, client): build the prompt from a context prefix with an arbitrary
skip predicate, send to the model, decode the completion. Takes both
the predicate and the client so callers can aim it at memory-stripped
contexts (finetune), same-context-different-model (F7 compare), or
whatever else.
- learn.rs drops its private copies of those helpers and the inline
generate_alternate; the finetune path now reads as
gen_continuation(context, idx, is_memory_node, client).
Pure refactor, no behavior change.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 15:20:02 -04:00
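The skip-predicate shape described in the commit message (a caller-supplied `FnMut(&AstNode) -> bool` deciding which context entries to drop during prompt assembly) can be sketched with hypothetical stand-in types. `Node`, `assemble_prompt`, and this `is_memory_node` are illustrative stubs, not the real `crate::agent` types, which are richer:

```rust
// Hypothetical stand-ins for the real AstNode / wire_prompt machinery.
#[derive(Debug, PartialEq)]
enum Node {
    User(String),
    Assistant(String),
    Memory(String),
}

// Analogue of the finetune path's predicate: skip memory entries.
fn is_memory_node(n: &Node) -> bool {
    matches!(n, Node::Memory(_))
}

// Analogue of prompt assembly over a context prefix: keep every node in
// 0..entry_idx for which the skip predicate returns false.
fn assemble_prompt<F>(nodes: &[Node], entry_idx: usize, mut skip: F) -> Vec<String>
where
    F: FnMut(&Node) -> bool,
{
    nodes[..entry_idx]
        .iter()
        .filter(|n| !skip(n))
        .map(|n| match n {
            Node::User(s) | Node::Assistant(s) | Node::Memory(s) => s.clone(),
        })
        .collect()
}

fn main() {
    let ctx = vec![
        Node::User("hi".into()),
        Node::Memory("recalled fact".into()),
        Node::Assistant("hello".into()),
        Node::User("next?".into()),
    ];
    // Memory-ablation: drop memory nodes from the prefix before generating.
    let prompt = assemble_prompt(&ctx, 3, is_memory_node);
    println!("{:?}", prompt); // prints ["hi", "hello"]
}
```

Because the predicate is a parameter rather than baked in, the same assembly code serves the memory-stripped finetune path and a keep-everything compare path (`|_| false`).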
// generate.rs — Continuation generation for scoring / comparison flows.
//
// Shared by the finetune pipeline (learn.rs) and the compare screen:
// given a context prefix and a skip predicate, generate what the model
// would say as the next assistant turn.

use crate::agent::api::{ApiClient, SamplingParams, StreamToken};
use crate::agent::context::{AstNode, ContextState};
use crate::agent::tokenizer;

/// Generate an assistant continuation from the context up to `entry_idx`,
/// with `skip` applied to identity + conversation entries during prompt
/// assembly. The model is whichever `client` points at: the default
/// runtime client for memory-ablation alternates, or a test-model client
/// for F7 comparison.
pub async fn gen_continuation<F>(
    context: &ContextState,
    entry_idx: usize,
    skip: F,
    client: &ApiClient,
) -> anyhow::Result<String>
where
    F: FnMut(&AstNode) -> bool,
{
    let (mut prompt, images, _) = context.wire_prompt(0..entry_idx, skip);

    // Open an assistant turn so the model completes as the assistant.
    prompt.push(tokenizer::IM_START);
    prompt.extend(tokenizer::encode("assistant\n"));

    let sampling = SamplingParams {
        temperature: 0.6,
        top_p: 0.95,
        top_k: 20,
    };
    let (mut rx, _guard) = client.stream_completion_mm(&prompt, &images, sampling, Some(-5));

    // Accumulate streamed token ids until the stream finishes or errors.
    let mut tokens = Vec::new();
    while let Some(tok) = rx.recv().await {
        match tok {
            StreamToken::Token { id, .. } => tokens.push(id),
            StreamToken::Done { .. } => break,
            StreamToken::Error(e) => anyhow::bail!("generation error: {}", e),
        }
    }

    Ok(tokenizer::decode(&tokens))
}
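The receive loop at the end of `gen_continuation` is a common accumulate-until-terminal pattern. A minimal synchronous sketch, with `std::sync::mpsc` standing in for the async channel and a simplified `Tok` enum standing in for the real `StreamToken` (which carries more fields and arrives asynchronously):

```rust
use std::sync::mpsc;

// Simplified stand-in for StreamToken: a token id, normal completion,
// or a generation error.
enum Tok {
    Token(u32),
    Done,
    Error(String),
}

// Synchronous analogue of the loop in gen_continuation: collect token
// ids until Done, bail out on Error.
fn collect(rx: mpsc::Receiver<Tok>) -> Result<Vec<u32>, String> {
    let mut tokens = Vec::new();
    while let Ok(tok) = rx.recv() {
        match tok {
            Tok::Token(id) => tokens.push(id),
            Tok::Done => break,
            Tok::Error(e) => return Err(format!("generation error: {}", e)),
        }
    }
    Ok(tokens)
}

fn main() {
    let (tx, rx) = mpsc::channel();
    for id in [5u32, 9, 12] {
        tx.send(Tok::Token(id)).unwrap();
    }
    tx.send(Tok::Done).unwrap();
    println!("{:?}", collect(rx).unwrap()); // prints [5, 9, 12]
}
```

Breaking on `Done` rather than draining the channel means any tokens queued after the terminal marker are deliberately discarded, matching the real loop's behavior.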