consciousness

Kent Overstreet 7e7e9a4b69	2026-04-16 02:04:26 -04:00

training: integrate /train into vLLM process (no separate daemon)

Remove standalone worker.py daemon. Training now runs inside vLLM:
- train_router.py: FastAPI router patched into vLLM's build_app()
- /train served on same port as /completions, /score
- Lazy-loads HF model with vLLM weight views on first request
- HOGWILD training: no pause, weights updated in-place

The previous architecture had a separate daemon on port 8080 that communicated with vLLM via pause/resume endpoints. This was wrong - training should run in-process, sharing GPU memory directly.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>

apollo_plugin	training: integrate /train into vLLM process (no separate daemon)	2026-04-16 02:04:26 -04:00
research	research: latent reasoning integration plans for Qwen 3.5 27B	2026-04-12 15:50:09 -04:00
DESIGN.md	training: integrate /train into vLLM process (no separate daemon)	2026-04-16 02:04:26 -04:00
pyproject.toml	training: integrate /train into vLLM process (no separate daemon)	2026-04-16 02:04:26 -04:00