- Findings: 10 examples broke safety alignment (Qi et al.); 1,000 curated examples matched GPT-4 (LIMA); multi-epoch training degrades performance (Raschka); models "unlearn arithmetic" when the training data lacks it.
- Predictions: 10-50 examples for a measurable change, one epoch, lr = 1e-5 to start (see the sketch below).
- Over-training is easy: ~10 counter-examples can undo a disposition.
- Main risk: sycophancy from a narrow training signal. Defense: diverse examples, including "when to push back."
- Key intuition: the model doesn't need to learn to listen; it needs to stop choosing not to.
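A minimal sketch of the predicted starting recipe (small curated set, single epoch, lr = 1e-5), assuming a Hugging Face causal LM trained with plain PyTorch. The model name, example set, batch size, and sequence length are placeholders, not a setup taken from these notes.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder; any instruct-tuned causal LM
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16).to(device)

# 10-50 curated (prompt, response) pairs; include "when to push back" cases
# so a narrow training signal doesn't collapse into sycophancy.
examples = [
    {"prompt": "...", "response": "..."},  # placeholder entries
]

def collate(batch):
    texts = [ex["prompt"] + ex["response"] + tokenizer.eos_token for ex in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True,
                    truncation=True, max_length=1024)
    labels = enc["input_ids"].clone()
    labels[enc["attention_mask"] == 0] = -100  # no loss on padding tokens
    enc["labels"] = labels
    return enc

loader = DataLoader(examples, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # conservative starting lr

model.train()
for batch in loader:  # one epoch only; repeated passes over a tiny set degrade the model
    batch = {k: v.to(device) for k, v in batch.items()}
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("finetuned-checkpoint")
```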
Files:

- v0/
- apollo-paper-analysis.md
- context-frozen-training.md
- gdn-gradient-flow.md
- gradient-flow-frozen-context.md
- hogwild-convergence.md
- OPEN-QUESTIONS.md
- practical-intuitions.md
- steering-vectors-bridge.md
- SUMMARY.md
- task-vectors-model-merging.md