consciousness/scripts
Kent Overstreet 377e2773bc Add MI300X provisioning script for vllm/Qwen 3.5 27B
ROCm-specific setup with:
- AITER attention backends (VLLM_ROCM_USE_AITER=1)
- Reduced cudagraph capture size (DeltaNet cache conflict)
- BF16 model + FP8 KV cache as default (FP8 weights can be
  slower on MI300X due to ROCm kernel maturity)
- FP8=1 flag for benchmarking FP8 model weights

Key for training plan: if FP8 matmuls are slow on MI300X,
the quantize-and-expand strategy needs B200 instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 14:40:15 -04:00
..
provision-mi300x.sh Add MI300X provisioning script for vllm/Qwen 3.5 27B 2026-03-19 14:40:15 -04:00
provision-vllm.sh tui: fix cursor position calculation 2026-03-19 00:45:07 -04:00