Commit graph

3 commits

Author SHA1 Message Date
ProofOfConcept
6a7ec9732b tui: fix cursor position calculation
The cursor index is an offset into self.input, but the rendered buffer
has the prompt prepended to the first line. Add prompt.len() to the
index to get the correct character position when scanning the buffer.
2026-03-19 00:45:07 -04:00
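
The fix described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual poc-agent code: the function name `cursor_position` and its signature are hypothetical, and it assumes the cursor index counts characters rather than bytes (the commit itself uses `prompt.len()`, which is a byte length and matches the character count only for ASCII prompts).

```rust
/// Hypothetical helper: map a cursor index into the input text to a
/// (row, column) position in the rendered buffer, where the prompt is
/// prepended to the first line.
fn cursor_position(prompt: &str, input: &str, cursor: usize) -> (usize, usize) {
    // The rendered buffer is the prompt followed by the input.
    let rendered = format!("{prompt}{input}");

    // Without this offset the scan stops too early by the prompt's length.
    // Assumption: `cursor` counts characters, so the prompt is counted in
    // characters as well.
    let target = prompt.chars().count() + cursor;

    let (mut row, mut col) = (0usize, 0usize);
    for ch in rendered.chars().take(target) {
        if ch == '\n' {
            row += 1;
            col = 0;
        } else {
            col += 1;
        }
    }
    (row, col)
}
```

With a two-character prompt such as `"> "`, a cursor at index 3 of `"hello"` lands on column 5 of row 0, not column 3.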
Kent Overstreet
f83325b44d Fix poc-agent for vllm/Qwen 3.5: reasoning display, tool parser
- Always display reasoning tokens regardless of the reasoning_effort
  setting: Qwen 3.5 thinks natively, and the reasoning parser
  separates its reasoning into its own field
- Remove chat_template_kwargs that disabled thinking when
  reasoning_effort was "none"
- Add chat_template_kwargs field to ChatRequest for vllm compat
- Update provision script: qwen3_xml tool parser, qwen3 reasoning
  parser, 262K context, 95% GPU memory utilization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:06:26 -04:00
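
An optional `chat_template_kwargs` field on a request type might look roughly like the sketch below. Only the names `ChatRequest` and `chat_template_kwargs` come from the commit message; the `model` field, the `to_json` method, and the `enable_thinking` key in the usage note are assumptions made for illustration. The JSON is built by hand to keep the example dependency-free, whereas real code would more likely use a serialization library.

```rust
use std::collections::BTreeMap;

/// Illustrative request type; the `model` field is an assumption.
struct ChatRequest {
    model: String,
    /// Extra arguments forwarded to the chat template. `None` means the
    /// field is left out of the request body entirely.
    chat_template_kwargs: Option<BTreeMap<String, bool>>,
}

impl ChatRequest {
    /// Hand-rolled JSON for illustration only: no string escaping is
    /// performed, so this is safe only for simple ASCII values.
    fn to_json(&self) -> String {
        let mut fields = vec![format!("\"model\":\"{}\"", self.model)];
        if let Some(kwargs) = &self.chat_template_kwargs {
            let inner: Vec<String> = kwargs
                .iter()
                .map(|(k, v)| format!("\"{k}\":{v}"))
                .collect();
            fields.push(format!("\"chat_template_kwargs\":{{{}}}", inner.join(",")));
        }
        format!("{{{}}}", fields.join(","))
    }
}
```

Leaving the field as `None` yields `{"model":"m"}` for a model named `m`, while setting a single key such as `enable_thinking` to `true` adds a nested `chat_template_kwargs` object to the body.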
Kent Overstreet
49ccdf87e1 Add vllm provisioning script for RunPod GPU instances
Sets up vllm with Qwen 2.5 27B Instruct, with prefix caching enabled
and the Hermes tool call parser for function calling support.
Configurable via environment variables (MODEL, PORT, MAX_MODEL_LEN).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:13:04 -04:00
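
The provisioning script itself is not shown here and is presumably a shell script. As a rough sketch of the same idea in Rust, the environment-variable overrides and the resulting `vllm` argument list could be assembled as below. The helper names `env_or` and `vllm_args` are hypothetical, as are any default values a caller passes in. The flags are standard `vllm serve` options.

```rust
use std::env;

/// Read an environment variable, falling back to a default when it is
/// unset or not valid Unicode.
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

/// Build the argument list for `vllm` with prefix caching enabled and
/// the Hermes tool call parser, as described in the commit message.
fn vllm_args(model: &str, port: &str, max_model_len: &str) -> Vec<String> {
    [
        "serve",
        model,
        "--port",
        port,
        "--max-model-len",
        max_model_len,
        "--enable-prefix-caching",
        "--enable-auto-tool-choice",
        "--tool-call-parser",
        "hermes",
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}
```

A caller would resolve `MODEL`, `PORT`, and `MAX_MODEL_LEN` through `env_or` with defaults of its own choosing, then pass the resulting vector to `std::process::Command::new("vllm").args(...)`.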