Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled,
Hermes tool call parser for function calling support. Configurable
via environment variables (MODEL, PORT, MAX_MODEL_LEN).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>