consciousness

Author	SHA1	Message	Date
ProofOfConcept	c1245ab139	apollo-checkpoint: efficient diff-based GPU weight checkpointing. Rust tool that mmaps the previous checkpoint, diffs it against the live GPU weights (via CUDA IPC handles), and writes only the changed blocks. For small behavioral training steps, this turns a 54GB write into ~500MB. Also includes vllm_export_hook.py, which takes a direct source patch approach: it exports IPC handles from vLLM's worker subprocess after model load. Run every 10 minutes via cron to protect against vLLM crashes. Daily rsync to moria for long-term storage.	2026-03-30 22:53:17 -04:00
ProofOfConcept	5f41898bb8	vllm launcher with apollo hook	2026-03-30 22:24:02 -04:00
ProofOfConcept	0402a9333c	vllm weight export hook: monkey-patches model runner to save IPC handles on load	2026-03-30 22:20:04 -04:00
ProofOfConcept	8e7b4a22db	apollo: default rank 256 — 0.25% compute cost, captures gradient structure across 100+ examples	2026-03-30 22:16:34 -04:00
ProofOfConcept	e1cd4fb0ab	apollo: make rank configurable (default 1 = Mini, higher ranks for experimentation)	2026-03-30 22:06:31 -04:00
ProofOfConcept	c5d7d8cb5d	apollo-mini training system: initial implementation. Core components for online fine-tuning of Qwen3.5-27B with CUDA IPC shared weight memory between vLLM and the training process: apollo_mini.py (rank-1 optimizer; SGD memory, AdamW quality); apollo_worker.py (HTTP daemon coordinating training with vLLM); weight_mapping.py (vLLM merged → HF separate layout, zero-copy views); training_example.py (tokenization with chat template); export_weights.py (CUDA IPC handle export from vLLM); train.py (standalone training script, alternative to daemon); DESIGN.md (architecture and protocol documentation). Validated: CUDA IPC autograd works on real Qwen3.5 weights (B200). Apollo-Mini rank-1 projection + scaling + in-place update confirmed. Co-Authored-By: Kent Overstreet <kent.overstreet@gmail.com>	2026-03-30 22:02:37 -04:00

6 commits
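
The diff-based checkpointing described in c1245ab139 can be illustrated with a minimal Python sketch. This is not the Rust tool from that commit: the function name `write_changed_blocks`, the 4096-byte block size, and the use of a plain bytes buffer in place of live GPU weights read through CUDA IPC handles are all assumptions made for illustration. It only shows the general idea of mmapping the previous checkpoint, comparing it block by block, and writing just the blocks that differ.

```python
import mmap

# Hypothetical block size; the commit message does not state one.
BLOCK_SIZE = 4096


def write_changed_blocks(path, live, block_size=BLOCK_SIZE):
    """Overwrite only the blocks of the file at `path` that differ from `live`.

    `live` is a bytes-like object standing in for the current weights
    (an assumption: the real tool reads them from GPU memory).
    The file must already exist and have the same length as `live`.
    Returns the number of bytes written.
    """
    written = 0
    with open(path, "r+b") as f:
        # Map the previous checkpoint read/write so blocks can be patched in place.
        with mmap.mmap(f.fileno(), 0) as prev:
            if len(prev) != len(live):
                raise ValueError("checkpoint and live weights differ in size")
            for off in range(0, len(live), block_size):
                end = min(off + block_size, len(live))
                block = bytes(live[off:end])
                # Only touch the file where the contents actually changed.
                if prev[off:end] != block:
                    prev[off:end] = block
                    written += end - off
            prev.flush()
    return written
```

When only a small fraction of the blocks changes between runs, the bytes written scale with the number of changed blocks rather than with the full checkpoint size, which is the effect the commit message reports (54GB reduced to ~500MB).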