consciousness/training
Kent Overstreet f4fb6db1ee amygdala: fix device mismatch in quality-report W_down handling
_compute_quality_report's single-neuron alignment was computing
cos(W_down.T, diff_l) with W_down on CUDA (inherited from the loaded
model) while diff_l lives on CPU (per_layer_vectors are kept on CPU
throughout training). Move W_down to CPU on extraction.

Surfaced during first real training run on b200 — training itself
completed cleanly (95 concepts x layer 63 in ~8s) but quality-report
crashed at the first single-neuron alignment check.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 20:52:50 -04:00
..
amygdala_stories amygdala: quality-report + cognitive-state training scenarios 2026-04-18 20:31:39 -04:00
amygdala_training amygdala: fix device mismatch in quality-report W_down handling 2026-04-18 20:52:50 -04:00
apollo_plugin training: move to dedicated subprocess with ZMQ communication 2026-04-16 02:04:26 -04:00
research research: latent reasoning integration plans for Qwen 3.5 27B 2026-04-12 15:50:09 -04:00
DESIGN.md training: move to dedicated subprocess with ZMQ communication 2026-04-16 02:04:26 -04:00
pyproject.toml training: move to dedicated subprocess with ZMQ communication 2026-04-16 02:04:26 -04:00