kent/consciousness

Commit graph

Author	SHA1	Message	Date

Kent Overstreet

f83325b44d

Fix poc-agent for vllm/Qwen 3.5: reasoning display, tool parser

- Always display reasoning tokens regardless of reasoning_effort
  setting — Qwen 3.5 thinks natively and the reasoning parser
  separates it into its own field
- Remove chat_template_kwargs that disabled thinking when
  reasoning_effort was "none"
- Add chat_template_kwargs field to ChatRequest for vllm compat
- Update provision script: qwen3_xml tool parser, qwen3 reasoning
  parser, 262K context, 95% GPU memory utilization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-19 00:06:26 -04:00

Kent Overstreet

49ccdf87e1

Add vllm provisioning script for RunPod GPU instances

Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled,
Hermes tool call parser for function calling support. Configurable
via environment variables (MODEL, PORT, MAX_MODEL_LEN).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-18 23:13:04 -04:00

2 commits
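
For readers who want to see how the settings named in the two commit messages might fit together, here is a minimal sketch of a `vllm serve` invocation. It is not the provisioning script from this repository. The function name `build_vllm_cmd`, the placeholder model id, the default port, the reading of "262K" as 262144 tokens, and the `--enable-auto-tool-choice` flag are all assumptions made for illustration. The sketch only prints the command line rather than starting a server.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the provisioning script from this repository.
# It only assembles and prints a `vllm serve` command line.

build_vllm_cmd() {
    # Environment overrides named in commit 49ccdf87e1.
    # The defaults below are placeholders chosen for this example.
    local model="${MODEL:-your-org/your-model}"
    local port="${PORT:-8000}"
    local max_len="${MAX_MODEL_LEN:-262144}"   # assumed meaning of "262K context"

    local cmd=(
        vllm serve "$model"
        --port "$port"
        --max-model-len "$max_len"
        --gpu-memory-utilization 0.95   # "95% GPU memory utilization"
        --enable-prefix-caching         # prefix caching, per 49ccdf87e1
        --enable-auto-tool-choice       # assumed: lets the server emit tool calls
        --tool-call-parser qwen3_xml    # f83325b44d switches away from hermes
        --reasoning-parser qwen3        # puts reasoning in its own response field
    )
    # Join the array with spaces and print it as one line.
    printf '%s\n' "${cmd[*]}"
}

build_vllm_cmd
```

Setting `MODEL`, `PORT`, or `MAX_MODEL_LEN` in the environment before calling the function changes the printed command. To actually launch a server, the printed line could be run directly on a machine with vLLM installed.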