training: integrate /train into vLLM process (no separate daemon)
Remove standalone worker.py daemon. Training now runs inside vLLM:

- train_router.py: FastAPI router patched into vLLM's build_app()
- /train served on same port as /completions, /score
- Lazy-loads HF model with vLLM weight views on first request
- HOGWILD training: no pause, weights updated in-place

The previous architecture had a separate daemon on port 8080 that
communicated with vLLM via pause/resume endpoints. This was wrong -
training should run in-process, sharing GPU memory directly.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
parent 2f08149fab
commit 7e7e9a4b69
6 changed files with 320 additions and 542 deletions
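The commit message outlines the new in-process design: a FastAPI router patched into vLLM's build_app(), a lazily constructed HF model whose weights are views over vLLM's GPU tensors, and HOGWILD-style in-place updates with no inference pause. Below is a minimal sketch of how such a router could be attached. It assumes vLLM's OpenAI-compatible server module exposes build_app(); the _LazyTrainer class, the /train payload shape, and the helper names are hypothetical stand-ins, not the actual apollo_plugin code.

```python
# Minimal sketch, not the actual train_router.py. Assumes vLLM's OpenAI
# server module exposes build_app(); _LazyTrainer and the /train payload
# are hypothetical stand-ins.
from fastapi import APIRouter, Request

router = APIRouter()
_trainer = None  # created on the first /train request (lazy load)


class _LazyTrainer:
    """Stand-in for the HF model built over vLLM weight views (assumption)."""

    def step(self, batch: dict) -> float:
        # A real step would run forward/backward over the shared GPU
        # tensors and apply the optimizer in place. HOGWILD: no locks,
        # no pausing of concurrent /completions traffic.
        return 0.0


def _get_trainer() -> _LazyTrainer:
    # Defer model construction until training is actually requested, so a
    # pure-inference server pays no extra memory or startup cost.
    global _trainer
    if _trainer is None:
        _trainer = _LazyTrainer()
    return _trainer


@router.post("/train")
async def train(request: Request) -> dict:
    batch = await request.json()
    loss = _get_trainer().step(batch)
    return {"loss": loss}


def patch_build_app() -> None:
    # Wrap vLLM's build_app() so the returned FastAPI app also serves
    # /train on the same port as /completions and /score.
    import vllm.entrypoints.openai.api_server as api_server

    original_build_app = api_server.build_app

    def build_app(*args, **kwargs):
        app = original_build_app(*args, **kwargs)
        app.include_router(router)
        return app

    api_server.build_app = build_app
```

Patching at the build_app() level keeps the router in the same process and event loop as inference, which is what makes in-place weight sharing possible without any IPC or pause/resume protocol.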
pyproject.toml
@@ -20,7 +20,6 @@ dev = ["pytest"]
 apollo = "apollo_plugin:register"
 
 [project.scripts]
-apollo-worker = "apollo_plugin.worker:main"
 apollo-checkpoint = "apollo_plugin.checkpoint_sync:main"
 
 [tool.setuptools.packages.find]