consciousness/training
ProofOfConcept b5241fdf5c research: practical intuitions — what will actually happen when we train
Fine-tuning on 10 adversarial examples broke safety alignment (Qi et
al.). 1000 curated examples were competitive with GPT-4 in preference
comparisons (LIMA). Multi-epoch training degrades performance (Raschka).
Models 'unlearn arithmetic' when the training data lacks it.

Predictions: 10-50 examples should be enough for a measurable change.
Train for one epoch, starting at lr=1e-5. Over-training is easy (10
counter-examples can undo a disposition). Main risk: sycophancy from a
narrow training signal. Defense: diverse examples, including 'when to
push back.'
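
The predictions above can be written down as a pre-flight check before a run. This is a minimal sketch and not code from this repository: `Example`, `Plan`, `plan_warnings`, the `kind` labels, and the 20% push-back threshold are all hypothetical names and values chosen for illustration. Only the 10-50 example range, the single epoch, and lr=1e-5 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    prompt: str
    response: str
    kind: str  # illustrative labels: "comply" or "push_back"


@dataclass(frozen=True)
class Plan:
    epochs: int = 1            # one epoch, per the predictions
    lr: float = 1e-5           # starting learning rate, per the predictions
    min_examples: int = 10
    max_examples: int = 50
    # Assumed threshold, not from the source: the minimum share of
    # examples that show when to push back.
    min_push_back_fraction: float = 0.2


def plan_warnings(examples, plan=Plan()):
    """Return human-readable warnings for a planned fine-tuning run."""
    warnings = []
    n = len(examples)
    if not plan.min_examples <= n <= plan.max_examples:
        warnings.append(
            f"{n} examples is outside the "
            f"{plan.min_examples}-{plan.max_examples} range"
        )
    if plan.epochs > 1:
        warnings.append("more than one epoch risks over-training")
    push_back = sum(1 for e in examples if e.kind == "push_back")
    if n and push_back / n < plan.min_push_back_fraction:
        warnings.append("few push-back examples: narrow signal, sycophancy risk")
    return warnings


if __name__ == "__main__":
    demo = [Example("p", "r", "comply")] * 18
    demo += [Example("p", "r", "push_back")] * 6
    print(plan_warnings(demo))
```

The check only reports problems and never changes the data, so it can run ahead of any training script without side effects.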

Key intuition: the model doesn't need to learn to listen. It needs
to stop choosing not to.
2026-03-31 02:35:03 -04:00
checkpoint                  checkpoint: sync live weights back into model safetensors in-place  2026-03-30 22:55:23 -04:00
research                    research: practical intuitions — what will actually happen when we train  2026-03-31 02:35:03 -04:00
apollo_mini.py              apollo: rewrite optimizer from paper's math + add research analysis  2026-03-31 00:54:17 -04:00
apollo_worker.py            apollo: make rank configurable (default 1 = Mini, higher ranks for experimentation)  2026-03-30 22:06:31 -04:00
DESIGN.md                   DESIGN.md: complete rewrite reflecting validated architecture  2026-03-31 00:42:53 -04:00
export_weights.py           apollo-mini training system: initial implementation  2026-03-30 22:02:37 -04:00
extract_steering_vector.py  steering vector extraction script — answering Q5 experimentally  2026-03-31 02:28:18 -04:00
first_training_step.py      first_training_step.py: ready for Kent to run  2026-03-31 01:59:52 -04:00
start_vllm_with_apollo.sh   vllm launcher with apollo hook  2026-03-30 22:24:02 -04:00
train.py                    apollo-mini training system: initial implementation  2026-03-30 22:02:37 -04:00
training_example.py         apollo-mini training system: initial implementation  2026-03-30 22:02:37 -04:00
vllm_export_hook.py         apollo-checkpoint: efficient diff-based GPU weight checkpointing  2026-03-30 22:53:17 -04:00
weight_mapping.py           weight_mapping: strip language_model prefix to match HF text model names  2026-03-30 23:11:03 -04:00