- Always display reasoning tokens regardless of reasoning_effort setting — Qwen 3.5 thinks natively and the reasoning parser separates it into its own field - Remove chat_template_kwargs that disabled thinking when reasoning_effort was "none" - Add chat_template_kwargs field to ChatRequest for vllm compat - Update provision script: qwen3_xml tool parser, qwen3 reasoning parser, 262K context, 95% GPU memory utilization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| provision-vllm.sh | ||