Add /v1/completions streaming path with raw token IDs
New stream_completions() in openai.rs sends prompt as token IDs to the completions endpoint instead of JSON messages to chat/completions. Handles <think> tags in the response (split into Reasoning events) and stops on <|im_end|> token.

start_stream_completions() on ApiClient provides the same interface as start_stream() but takes token IDs instead of Messages.

The turn loop in Agent::turn() uses completions when the tokenizer is initialized, falling back to the chat API otherwise. This allows gradual migration: consciousness uses completions (Qwen tokenizer), Claude Code hook still uses chat API (Anthropic).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
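The body of stream_completions() is not part of the hunk shown here, so the following is only a rough sketch of how splitting <think> tags out of a streamed response could work. It assumes the response arrives as text chunks and that the stop marker appears as the literal text <|im_end|>; the real code may instead compare token IDs. The names Event, ThinkSplitter, and held_back are hypothetical and do not come from the commit.

```rust
/// Hypothetical event type; the commit's real StreamEvent enum is not shown.
#[derive(Debug, PartialEq)]
enum Event {
    Content(String),
    Reasoning(String),
    Done,
}

const OPEN: &str = "<think>";
const CLOSE: &str = "</think>";
const STOP: &str = "<|im_end|>"; // assumed to arrive as literal text

/// Incremental splitter: feed text chunks in, get classified events out.
struct ThinkSplitter {
    buf: String,
    in_think: bool,
    done: bool,
}

/// Length of the longest buffer suffix that could still grow into a marker.
fn held_back(buf: &str, markers: &[&str]) -> usize {
    let longest = markers.iter().map(|m| m.len()).max().unwrap_or(0);
    let max_keep = buf.len().min(longest.saturating_sub(1));
    (1..=max_keep)
        .rev()
        .find(|&k| {
            let start = buf.len() - k;
            buf.is_char_boundary(start)
                && markers.iter().any(|m| m.starts_with(&buf[start..]))
        })
        .unwrap_or(0)
}

impl ThinkSplitter {
    fn new() -> Self {
        Self { buf: String::new(), in_think: false, done: false }
    }

    fn emit(&self, text: &str, out: &mut Vec<Event>) {
        if text.is_empty() {
            return;
        }
        out.push(if self.in_think {
            Event::Reasoning(text.to_string())
        } else {
            Event::Content(text.to_string())
        });
    }

    /// Consume one chunk; text after the stop marker is discarded.
    fn push(&mut self, chunk: &str) -> Vec<Event> {
        let mut out = Vec::new();
        if self.done {
            return out;
        }
        self.buf.push_str(chunk);
        loop {
            let markers = if self.in_think { [CLOSE, STOP] } else { [OPEN, STOP] };
            // Earliest complete marker currently in the buffer, if any.
            let hit = markers
                .iter()
                .filter_map(|m| self.buf.find(*m).map(|i| (i, *m)))
                .min_by_key(|(i, _)| *i);
            match hit {
                Some((i, m)) => {
                    self.emit(&self.buf[..i], &mut out);
                    self.buf.replace_range(..i + m.len(), "");
                    if m == STOP {
                        self.done = true;
                        self.buf.clear();
                        out.push(Event::Done);
                        return out;
                    }
                    self.in_think = m == OPEN;
                }
                None => {
                    // Keep back a possible partial marker split across chunks.
                    let keep = held_back(&self.buf, &markers);
                    let cut = self.buf.len() - keep;
                    self.emit(&self.buf[..cut], &mut out);
                    self.buf.replace_range(..cut, "");
                    return out;
                }
            }
        }
    }

    /// Flush any held-back text when the stream ends without a stop marker.
    fn finish(&mut self) -> Vec<Event> {
        let mut out = Vec::new();
        if !self.done {
            self.emit(&self.buf, &mut out);
            self.buf.clear();
        }
        out
    }
}
```

The held-back suffix is the main design point: a tag such as <think> can be split across two network chunks, so text that might be the start of a marker is withheld until the next chunk settles the question.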
parent e9765799c4
commit f458af6dec
3 changed files with 188 additions and 8 deletions
@@ -133,6 +133,34 @@ impl ApiClient {
         (rx, AbortOnDrop(handle))
     }
 
+    /// Start a streaming completion with raw token IDs.
+    /// No message formatting — the caller provides the complete prompt as tokens.
+    pub(crate) fn start_stream_completions(
+        &self,
+        prompt_tokens: &[u32],
+        sampling: SamplingParams,
+        priority: Option<i32>,
+    ) -> (mpsc::UnboundedReceiver<StreamEvent>, AbortOnDrop) {
+        let (tx, rx) = mpsc::unbounded_channel();
+        let client = self.client.clone();
+        let api_key = self.api_key.clone();
+        let model = self.model.clone();
+        let prompt_tokens = prompt_tokens.to_vec();
+        let base_url = self.base_url.clone();
+
+        let handle = tokio::spawn(async move {
+            let result = openai::stream_completions(
+                &client, &base_url, &api_key, &model,
+                &prompt_tokens, &tx, sampling, priority,
+            ).await;
+            if let Err(e) = result {
+                let _ = tx.send(StreamEvent::Error(e.to_string()));
+            }
+        });
+
+        (rx, AbortOnDrop(handle))
+    }
+
     pub(crate) async fn chat_completion_stream_temp(
         &self,
         messages: &[Message],
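
The hunk above forwards a failed request to the consumer as a StreamEvent::Error, so the receiver sees one stream of events and no separate error path. A minimal sketch of that pattern follows, using std::thread and std::sync::mpsc as stand-ins for tokio::spawn and the tokio unbounded channel so it runs without external crates. The function spawn_stream and the Content variant are hypothetical, and a std thread cannot be aborted, so nothing here corresponds to AbortOnDrop.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, PartialEq)]
enum StreamEvent {
    Content(String), // hypothetical variant, for illustration only
    Error(String),   // mirrors StreamEvent::Error(e.to_string()) in the hunk
}

/// Run `work` on a background thread and hand back the receiving end.
/// If `work` fails, the error is sent as the last event on the channel.
fn spawn_stream<F>(work: F) -> mpsc::Receiver<StreamEvent>
where
    F: FnOnce(&mpsc::Sender<StreamEvent>) -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        if let Err(e) = work(&tx) {
            // The receiver may already be gone; ignore a failed send,
            // as the original does with `let _ = tx.send(...)`.
            let _ = tx.send(StreamEvent::Error(e));
        }
        // `tx` is dropped here, which closes the channel for the receiver.
    });
    rx
}
```

A caller can then drain rx.iter() and handle StreamEvent::Error in the same match as ordinary output; the iterator ends once the worker drops its sender.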