Remove standalone worker.py daemon. Training now runs inside vLLM:
- train_router.py: FastAPI router patched into vLLM's build_app()
  (sketch after this list)
- /train served on same port as /completions, /score
- Lazy-loads HF model with vLLM weight views on first request
- Hogwild!-style training: no pause/resume step, weights updated in place
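
A minimal sketch of the patch, assuming vLLM's OpenAI-compatible server
builds its FastAPI app via build_app() (true of recent releases);
train_step() is a hypothetical in-process training entry point:

    # train_router.py (sketch); train_step() is hypothetical
    from fastapi import APIRouter, Request
    import vllm.entrypoints.openai.api_server as api_server

    router = APIRouter()

    @router.post("/train")
    async def train(request: Request):
        batch = await request.json()
        loss = train_step(batch)  # hypothetical: one in-place train step
        return {"loss": loss}

    _build_app = api_server.build_app

    def build_app(args):
        app = _build_app(args)      # vLLM's own app: /completions, /score
        app.include_router(router)  # /train rides on the same port
        return app

    api_server.build_app = build_app
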
The previous architecture had a separate daemon on port 8080 that
communicated with vLLM via pause/resume endpoints. This was wrong:
training should run in-process, sharing GPU memory directly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Core components for online fine-tuning of Qwen3.5-27B with CUDA IPC
shared weight memory between vLLM and the training process:
- apollo_mini.py: rank-1 optimizer (SGD-like memory, AdamW-level quality)
- apollo_worker.py: HTTP daemon coordinating training with vLLM
- weight_mapping.py: vLLM merged → HF separate layout (zero-copy views;
  sketched after this list)
- training_example.py: tokenization with chat template
- export_weights.py: CUDA IPC handle export from vLLM
- train.py: standalone training script (alternative to daemon)
- DESIGN.md: architecture and protocol documentation
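
The zero-copy mapping works because vLLM concatenates Q, K and V along
the output dimension of its merged qkv_proj (likewise gate/up in
gate_up_proj), so the HF-layout tensors can be narrow() views over the
same storage. A minimal sketch with illustrative sizes, not Qwen's
actual shapes:

    import torch

    def split_qkv(merged: torch.Tensor, q_out: int, kv_out: int):
        # Views into vLLM's merged qkv_proj weight: no copies, so
        # in-place optimizer updates are visible to the serving engine.
        q = merged.narrow(0, 0, q_out)
        k = merged.narrow(0, q_out, kv_out)
        v = merged.narrow(0, q_out + kv_out, kv_out)
        return q, k, v

    qkv = torch.randn(4096 + 2 * 1024, 4096)  # illustrative sizes
    q, k, v = split_qkv(qkv, 4096, 1024)
    assert q.data_ptr() == qkv.data_ptr()     # same memory, zero copy
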
Validated: CUDA IPC autograd works on real Qwen3.5 weights (B200).
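
For reference, the handle export can be done with PyTorch's private
CUDA-IPC storage APIs (they exist but are version-dependent, and
export_weights.py may differ). Producer and consumer must be separate
processes; a CUDA IPC handle cannot be reopened in the process that
exported it:

    import torch

    # Process A (vLLM side): export an IPC handle for a weight tensor.
    w = torch.randn(16, 16, device="cuda")
    meta = w.untyped_storage()._share_cuda_()  # private API: device,
                                               # handle, sizes, refcounts
    # send `meta` plus w.shape/w.dtype to process B (e.g. over HTTP)

    # Process B (trainer side): map the same GPU memory, zero copy.
    storage = torch.UntypedStorage._new_shared_cuda(*meta)  # private API
    w_view = torch.empty(0, dtype=torch.float32, device=storage.device)
    w_view.set_(storage, 0, (16, 16))
    # w_view aliases process A's memory: in-place updates land directly
    # in the weights vLLM serves from.
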
Apollo-Mini rank-1 projection + scaling + in-place update confirmed.
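
For illustration, a toy sketch of that rank-1 idea: project each
gradient onto a fixed random direction, keep AdamW-style moments only
for the projected per-row scalars, and rescale the raw gradient with
them before an in-place update. Details are illustrative, not
apollo_mini.py's actual math:

    import torch

    class ApolloMiniSketch:
        """Rank-1 APOLLO-style optimizer: state is O(rows) per tensor."""

        def __init__(self, params, lr=1e-5, betas=(0.9, 0.999), eps=1e-8):
            self.params = list(params)
            self.lr, self.betas, self.eps = lr, betas, eps
            self.state = {}

        @torch.no_grad()
        def step(self):
            b1, b2 = self.betas
            for p in self.params:
                if p.grad is None:
                    continue
                g = p.grad.reshape(p.shape[0], -1)
                st = self.state.setdefault(p, {
                    "proj": torch.randn_like(g[0]) / g.shape[1] ** 0.5,
                    "m": torch.zeros_like(g[:, 0]),
                    "v": torch.zeros_like(g[:, 0]),
                })
                r = g @ st["proj"]                      # rank-1 projection
                st["m"].mul_(b1).add_(r, alpha=1 - b1)  # moments in 1-D
                st["v"].mul_(b2).addcmul_(r, r, value=1 - b2)
                # Per-row scaling: |AdamW-style update| / |raw projection|
                scale = st["m"].abs() / (st["v"].sqrt() + self.eps)
                scale = scale / (r.abs() + self.eps)
                p.add_((g * scale[:, None]).view_as(p), alpha=-self.lr)
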
Co-Authored-By: Kent Overstreet <kent.overstreet@gmail.com>