HOGWILD (no pause), rank-256, channel scaling, CUDA IPC validated (851/851 params, forward+backward confirmed), dream-loop-as-trainer, Anthropic instruction stripping method, diversity as regularization, in-place checkpoint sync, three-tier training pipeline. |
||
|---|---|---|
| .. | ||
| checkpoint | ||
| apollo_mini.py | ||
| apollo_worker.py | ||
| DESIGN.md | ||
| export_weights.py | ||
| start_vllm_with_apollo.sh | ||
| train.py | ||
| training_example.py | ||
| vllm_export_hook.py | ||
| weight_mapping.py | ||