Replace token counting with token generation via HuggingFace tokenizer

Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates actual token IDs including chat template wrapping. ContextEntry now stores token_ids: Vec<u32> instead of tokens: usize; the count is derived from the length.

ContextEntry::new() tokenizes automatically via the global tokenizer. ContextSection::push_entry() takes a raw ConversationEntry and tokenizes it. set_message() re-tokenizes without needing an external tokenizer parameter.

Token IDs include the full chat template: <|im_start|>role\ncontent <|im_end|>\n, so concatenating token_ids across entries produces a ready-to-send prompt for vLLM's /v1/completions endpoint.

The old tiktoken CoreBPE is now unused on Agent (will be removed in a followup). Token counts are now exact for Qwen 3.5 instead of the ~85-90% approximation from cl100k_base.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
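
The storage change described in the message can be sketched roughly as follows. This is a simplified illustration, not the code from this commit: `toy_encode` is a hypothetical byte-level stand-in for the real Qwen 3.5 tokenizer, `tokens()` and `build_prompt` are assumed helper names, and the exact whitespace inside the template is an assumption.

```rust
/// Hypothetical stand-in for the real tokenizer: one ID per byte.
/// The commit uses a HuggingFace tokenizer for Qwen 3.5 instead.
fn toy_encode(text: &str) -> Vec<u32> {
    text.bytes().map(u32::from).collect()
}

/// Simplified entry: stores token IDs rather than a bare count.
struct ContextEntry {
    token_ids: Vec<u32>,
}

impl ContextEntry {
    /// Wrap role and content in the chat template, then tokenize.
    fn new(role: &str, content: &str) -> Self {
        let wrapped = format!("<|im_start|>{role}\n{content}<|im_end|>\n");
        Self { token_ids: toy_encode(&wrapped) }
    }

    /// The count is derived from the stored IDs, not tracked separately.
    fn tokens(&self) -> usize {
        self.token_ids.len()
    }
}

/// Concatenate per-entry IDs into a single prompt, in entry order.
fn build_prompt(entries: &[ContextEntry]) -> Vec<u32> {
    entries
        .iter()
        .flat_map(|e| e.token_ids.iter().copied())
        .collect()
}
```

Because each entry carries its own template wrapping, the prompt length is just the sum of the per-entry counts, and no second tokenization pass is needed when assembling the request.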
2026-04-08 11:20:03 -04:00 · 5e4067c04f
commit 5e4067c04f
parent 70ee7abea5
10 changed files with 540 additions and 97 deletions
--- a/src/main.rs
+++ b/src/main.rs
@@ -950,6 +950,13 @@ fn main() {
        return;
    }

+    // Initialize the Qwen tokenizer for direct token generation
+    let tokenizer_path = dirs::home_dir().unwrap_or_default()
+        .join(".consciousness/tokenizer-qwen35.json");
+    if tokenizer_path.exists() {
+        crate::agent::tokenizer::init(&tokenizer_path.to_string_lossy());
+    }
+
    let cli = Cli::parse();

    if let Err(e) = cli.command.run() {
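
The hunk above calls `init` only when the tokenizer file exists, so any code that reads the global tokenizer has to cope with it being unset. One way such a process-wide tokenizer could be held is a `std::sync::OnceLock`. The sketch below is an assumption about the pattern, not the contents of agent/tokenizer.rs: the `Tokenizer` struct, its `source` field, and the `get` accessor are hypothetical.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the loaded tokenizer; the real one would
/// hold the HuggingFace tokenizer parsed from the JSON file.
struct Tokenizer {
    source: String,
}

/// Process-wide slot, empty until `init` is called.
static TOKENIZER: OnceLock<Tokenizer> = OnceLock::new();

/// Install the global tokenizer. The first call wins; later calls are ignored.
fn init(path: &str) {
    let _ = TOKENIZER.set(Tokenizer { source: path.to_owned() });
}

/// Returns `None` when `init` was never called, e.g. because the
/// tokenizer file was missing at startup.
fn get() -> Option<&'static Tokenizer> {
    TOKENIZER.get()
}
```

Returning an `Option` from the accessor forces callers to decide what to do when no tokenizer was loaded, rather than panicking deep inside tokenization.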