Replace token counting with token generation via HuggingFace tokenizer
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates actual token IDs including chat template wrapping. ContextEntry now stores token_ids: Vec<u32> instead of tokens: usize; the count is derived from the length.

ContextEntry::new() tokenizes automatically via the global tokenizer. ContextSection::push_entry() takes a raw ConversationEntry and tokenizes it. set_message() re-tokenizes without needing an external tokenizer parameter.

Token IDs include the full chat template: <|im_start|>role\ncontent<|im_end|>\n, so concatenating token_ids across entries produces a ready-to-send prompt for vLLM's /v1/completions endpoint.

The old tiktoken CoreBPE is now unused on Agent (will be removed in a follow-up). Token counts are now exact for Qwen 3.5 instead of the ~85-90% approximation from cl100k_base.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
parent 70ee7abea5
commit 5e4067c04f
10 changed files with 540 additions and 97 deletions
@@ -33,7 +33,7 @@ pub fn section_to_view(section: &ContextSection) -> SectionView {
             };
             SectionView {
                 name: ce.entry.label(),
-                tokens: ce.tokens,
+                tokens: ce.tokens(),
                 content,
                 children: Vec::new(),
                 status: String::new(),
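The move from a stored `tokens` field to a `tokens()` accessor, as described in the commit message, can be sketched roughly as below. This is a minimal illustration, not the code from this commit: `toy_encode` is a hypothetical stand-in that emits one ID per byte in place of the real Qwen 3.5 HuggingFace tokenizer, and the struct fields, `render_chat_template`, and `build_prompt` are assumed names chosen for the example.

```rust
/// Hypothetical stand-in for the global tokenizer: one "token" per byte.
/// The real commit uses a HuggingFace tokenizer for Qwen 3.5 instead.
fn toy_encode(text: &str) -> Vec<u32> {
    text.bytes().map(u32::from).collect()
}

/// Wrap a message in the chat template named in the commit message.
fn render_chat_template(role: &str, content: &str) -> String {
    format!("<|im_start|>{role}\n{content}<|im_end|>\n")
}

/// Simplified entry: stores token IDs rather than a bare count.
pub struct ContextEntry {
    role: String,
    content: String,
    token_ids: Vec<u32>,
}

impl ContextEntry {
    /// Tokenizes on construction, so callers never pass a tokenizer in.
    pub fn new(role: &str, content: &str) -> Self {
        let token_ids = toy_encode(&render_chat_template(role, content));
        Self {
            role: role.to_string(),
            content: content.to_string(),
            token_ids,
        }
    }

    /// The count is derived from the stored IDs, never tracked separately.
    pub fn tokens(&self) -> usize {
        self.token_ids.len()
    }

    pub fn content(&self) -> &str {
        &self.content
    }

    /// Replaces the message text and re-tokenizes in one step.
    pub fn set_message(&mut self, content: &str) {
        self.content = content.to_string();
        self.token_ids = toy_encode(&render_chat_template(&self.role, content));
    }
}

/// Concatenates per-entry IDs into a single prompt, in entry order.
pub fn build_prompt(entries: &[ContextEntry]) -> Vec<u32> {
    entries
        .iter()
        .flat_map(|e| e.token_ids.iter().copied())
        .collect()
}
```

Because each entry already carries its template-wrapped IDs, the prompt length is simply the sum of the per-entry `tokens()` values, which is why call sites such as `section_to_view` only need to switch from field access to the method call.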