switch memory scoring to /v1/score endpoint

Replace prompt_logprobs-based scoring with the new vLLM /v1/score
endpoint. Much simpler: one API call per memory drop, returns
per-message total_logprob directly. No chunking needed, no OOM risk
— the endpoint only computes logits for scored tokens.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Kent Overstreet 2026-04-03 00:31:57 -04:00
parent 249726599b
commit e8c3ed3d96
2 changed files with 99 additions and 203 deletions


@@ -228,6 +228,15 @@ impl Message {
        self.content.as_ref().map_or("", |c| c.as_text())
    }

    pub fn role_str(&self) -> &str {
        match self.role {
            Role::System => "system",
            Role::User => "user",
            Role::Assistant => "assistant",
            Role::Tool => "tool",
        }
    }

    fn now() -> Option<String> {
        Some(Utc::now().to_rfc3339_opts(chrono::SecondsFormat::Secs, true))
    }
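A minimal sketch of the scoring call the commit message describes: one request per memory drop, carrying the whole transcript, with the server expected to return a total_logprob per message. The body shape and field names (`model`, `messages`, `role`, `content`) are illustrative assumptions, not taken from the diff; a real client would use a JSON library rather than string formatting.

```rust
/// Build a JSON request body for a /v1/score-style endpoint.
/// Assumes the endpoint accepts a chat transcript and scores each
/// message; field names here are hypothetical, not from the diff.
fn score_request_body(model: &str, messages: &[(&str, &str)]) -> String {
    let msgs: Vec<String> = messages
        .iter()
        .map(|(role, content)| {
            // NOTE: no JSON escaping here; a sketch only.
            format!(r#"{{"role":"{}","content":"{}"}}"#, role, content)
        })
        .collect();
    format!(r#"{{"model":"{}","messages":[{}]}}"#, model, msgs.join(","))
}

fn main() {
    // One call covers the full transcript; no chunking is needed
    // because only the scored tokens' logits are computed server-side.
    let body = score_request_body(
        "example-model",
        &[("system", "You are helpful."), ("user", "Hello")],
    );
    println!("{}", body);
}
```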