ROCm-specific setup with:

- AITER attention backends (`VLLM_ROCM_USE_AITER=1`)
- Reduced cudagraph capture sizes (avoids a DeltaNet cache conflict)
- BF16 model + FP8 KV cache as the default (FP8 weights can be slower on MI300X due to ROCm kernel maturity)
- An `FP8=1` flag for benchmarking FP8 model weights

Key for the training plan: if FP8 matmuls are slow on MI300X, the quantize-and-expand strategy needs B200 instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
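As a rough illustration of how a provisioning script might wire these defaults together, here is a minimal sketch. The model-path variables (`BF16_MODEL`, `FP8_MODEL`) and the capture-size list are placeholder assumptions, not values from the actual scripts; the vLLM flags used (`--kv-cache-dtype`, `--dtype`, `--compilation-config`) exist in current vLLM but should be checked against the installed version.

```bash
#!/usr/bin/env bash
# Illustrative sketch of the serve defaults described in the commit message.
# Model paths and the cudagraph capture-size list are placeholders.
set -euo pipefail

export VLLM_ROCM_USE_AITER=1  # enable AITER attention backends on ROCm

# Smaller cudagraph capture sizes to sidestep the DeltaNet cache conflict
# (the exact list here is an assumption).
ARGS=(
  --kv-cache-dtype fp8
  --compilation-config '{"cudagraph_capture_sizes": [1, 2, 4, 8]}'
)

if [[ "${FP8:-0}" == "1" ]]; then
  # FP8=1: benchmark FP8 model weights (may be slower on MI300X today)
  MODEL="${FP8_MODEL:?set FP8_MODEL to an FP8 checkpoint}"
else
  # Default path: BF16 weights with an FP8 KV cache
  MODEL="${BF16_MODEL:?set BF16_MODEL to a BF16 checkpoint}"
  ARGS+=(--dtype bfloat16)
fi

vllm serve "$MODEL" "${ARGS[@]}"
```

Under this reading, `FP8=1 ./provision-vllm.sh` would switch benchmarking to FP8 weights, matching the `FP8=1` flag mentioned above.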
Files:

- provision-mi300x.sh
- provision-vllm.sh