consciousness

Author	SHA1	Message	Date
ProofOfConcept	875cffd6d7	amygdala: merge direct descriptions + chat template into train_with_library Kent's plan: keep stories for working concepts, replace stories for trouble concepts with direct first-person descriptions, train all together. More diverse negative pool than the 6-concept-only direct test, which was too homogeneous for PCA to find emotion axis. Deleted story files for 6 trouble concepts (14 files across stories/ and paired/). Added --direct-dir and --chat-template flags. When --chat-template is on, every positive_str and negative_str is wrapped as a "Say something." / "[text]" user-assistant pair. Prompt is identical across positives and negatives so it cancels in the pos-neg delta. What PCA sees is variation in the assistant content — which is where the emotion lives. Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute neutral descriptions to every concept's negative pool, giving PCA an anchor against "just any assistant utterance" noise.	2026-04-19 00:15:15 -04:00
ProofOfConcept	7a48e03dde	amygdala stories: remove peaceful from cluster scenarios n20-v2 training showed peaceful sign-flipped into the cozy/sensual/content/resigned cluster after I added peaceful stories in sunday_afternoon and park_after_rain — scenarios already dominated by that cluster's phenomenology (on couch under blanket, tree with thermos). Lesson: no matter how carefully the prose distinguishes peaceful from cozy ("she was not savoring the moment — that would have been another kind of doing"), PCA latches onto the shared setup features. You can't write peaceful IN the cluster scenarios without contaminating. Reverting. Keeping only kitchen_at_3am/peaceful (original) and stories/peaceful.txt (lake at six, outside all clusters).	2026-04-18 23:30:41 -04:00
ProofOfConcept	00a2cdce09	amygdala stories: relabel + strengthen weak-signal concepts Reread each story asking "what does this convey to me?" Found two clear mislabels and several concepts with too few positives for stable PCA: tender: only 1 story, and it was anticipatory grief (care for a dying dog), not tender. Moved to anticipatory_grief.txt as its own concept. Rewrote tender.txt + added 2 paired tender stories (the_doorway, the_undressing) — directed softness, gentle-by-nature, not gentle-because-fragile. bitter: letter_in_drawer/bitter was disillusioned / processed hurt ("did not slam the drawer"), not bitter. Rewrote it with actual sour grudge. Added the_long_meeting/bitter (watching colleague take credit for your reassigned work). peaceful: 1 story → 4 (added stories/peaceful.txt + paired park_after_rain, sunday_afternoon). onto_something: all 3 stories were code epiphanies, narrowing the concept. Added stories/onto_something.txt with a non-code pattern-click (sales-demo causing churn). terrified: 2 stories, both "waiting for bad news." Added kitchen_at_3am/terrified — acute threat-in-the-house terror.	2026-04-18 23:19:00 -04:00
ProofOfConcept	0993712bd0	amygdala stories: give content + resigned more settings Training on `537c72bd46` showed grief_stricken successfully broke out of the cozy cluster, but content (single scenario: sunday_afternoon) took its place — pulled into couch-blanket phenomenology at cosine 0.68-0.82 with cozy/sensual/resigned. Same fix: spread each concept across multiple settings so PCA has to find the valence axis, not the scene axis. content: + finishing_the_patch, the_writing_session, park_after_rain resigned: + the_comment, the_long_meeting Resigned had 2 scenarios (sunday_afternoon, waiting_for_results) — both about accepting something unwanted in a slow/private context. Adding work-context resigned (PR review you lost, restructuring meeting) should pull it out of that cluster.	2026-04-18 22:52:07 -04:00
ProofOfConcept	537c72bd46	amygdala stories: hold concept, vary setting Companion to `67c172ac0e` (hold setup, vary valence). That commit let PCA distinguish cozy from grief_stricken within a single scenario; this one gives each concept enough cross-scenario stories that PCA can learn the concept axis independent of any one scene. Before: cozy/sensual/grief_stricken each existed in a single scenario (sunday_afternoon), so the "cozy direction" PCA found was entangled with the solitary-couch-blanket phenomenology. After, each concept spans three scenarios: cozy: sunday_afternoon, kitchen_at_3am, park_after_rain sensual: sunday_afternoon, kitchen_at_3am, park_after_rain grief_stricken: sunday_afternoon, the_long_meeting, the_morning_commute grief_stricken now includes active/non-solitary contexts (functioning through a meeting; going to work eleven days after a death), which specifically breaks the "slowed-down-at-home" cluster that was dragging cozy/sensual/resigned/grief_stricken toward each other.	2026-04-18 22:44:53 -04:00
Kent Overstreet	67c172ac0e	amygdala stories: held-setup + varied-valence disambiguation The library-PCA run produced otherwise-clean concept directions but cozy/sensual → resigned/grief_stricken with cos ~0.7-0.8. Diagnosis: all four stories genuinely share 'solitary woman at home, slowed body, interior attention, domestic stillness' as their dominant phenomenology. PCA correctly finds that cluster as THE concept because no story in the corpus holds that setup constant while varying valence — every 'slowed-body domestic' story happens to ALSO be positive-valence (cozy/sensual) or negative-valence (resigned/ grief_stricken). Adding paired variants that hold setup constant: - sunday_afternoon/resigned.txt — same couch + blanket, inner state is 'Monday is going to bring bad news, this is the last Sunday like this' - sunday_afternoon/grief_stricken.txt — same couch + blanket, inner state is 'three weeks since mother died, cat she can't feel' - waiting_for_results/at_ease.txt — same wait-for-call-setup as the existing resigned variant, inner state is calm preparedness Forces the next retrain to find the valence-within-cluster axis as the emotion direction rather than the cluster-membership axis. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:29:28 -04:00
Kent Overstreet	71f6053851	amygdala stories: disambiguation scenarios for fragmented concepts Three new paired scenarios targeting the concepts that came out fragmented or collapsed in the L58-63 quality analysis: - sunday_afternoon/ — same setup (couch, blanket, Sunday light), three phenomenological framings for content/cozy/sensual. The previous stories for these three differed in setting as well as phenomenology, which let "comfortable body at home" dominate the shared signal. Locking the setting forces the model to isolate what each concept adds: life-rightness (content) vs. warm-shelter (cozy) vs. sensory-aliveness (sensual). - the_writing_session/ — essay drafting under deadline. in_flow / anxious / stuck variants force the cognitive-state family apart on the same cognitive task. in_flow specifically targets the transparent-effort phenomenology (hands-followed, time dilation) rather than the broader feel-good it was absorbing. - the_morning_commute/ — anchors anxious to performance/work-anxiety flavor, paired with calm. The 5 existing anxious stories were phenomenologically diverse (performance, social, existential); this adds a specific homogeneous instance to pull the centroid. After retraining: expect first_pc_variance_ratio to rise for in_flow and anxious, and nearest_concepts cosine to drop for content/cozy/sensual. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:08:23 -04:00
Kent Overstreet	ce24d9ce6b	amygdala: quality-report + cognitive-state training scenarios Training pipeline additions: - `--quality-report` flag: after producing per-concept vectors, compute per-concept diagnostics and write quality.json. Metrics per concept: * SVD of centered positives -> first_pc_variance_ratio (rank analysis; >0.7 clean, <0.4 fragmented) * Per-story alignment cosines (stories agree or disagree) * Single-neuron alignment: best cosine(direction, W_down column) at each target layer (>0.6 = essentially one MLP neuron) * Top-2 outlier stories by alignment (candidates for mislabeling or off-topic) * Top-5 nearest concepts by cosine (cross-concept contamination) Triage summary printed at end. New paired scenarios for cognitive-process states (for alpha-beta pruning): tracing_a_bug, reading_unfamiliar_code, finding_the_abstraction. Each has baseline + onto_something / stuck / in_flow / determined variants. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 20:31:39 -04:00
Kent Overstreet	2e03bbb7ea	training: add the_paper paired scenario for attention-engagement axis Seven framings of reading an unfamiliar technical paper, targeting the attention/engagement cluster that we identified tonight as the single highest-value DMN signal: * baseline — neutral reading * piqued — surprise + curiosity (the "wait, what" attention hook; THIS is the key DMN engagement signal) * focused — steady attention without surprise * bored — failing engagement * surprised — expectation violation without the curiosity hook (distinct from piqued: startled/alarmed, not pulled in) * amazed — marvel at elegance (appreciation, not engagement) * drifting — attention dissolving, precursor to boredom Particularly clean contrast on piqued vs surprised vs amazed — three states that get lumped together in casual usage but have distinct phenomenology and distinct DMN implications. Piqued is what routes attention; surprised alone doesn't; amazed is what you feel AFTER the engagement has paid off. These three should train into meaningfully different directions with paired CAA. Ready for next retrain when we do it. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 03:24:20 -04:00
Kent Overstreet	50d5b3f6e1	training/amygdala_stories: add 4 paired scenarios for weak clusters Target the emotion families that failed to cluster in the initial training round (layer-wise validation showed them anti-clustered or scattered at deep layers): anger, high-arousal positive, sexual range, social positive. Paired scenarios hold content constant and vary only the emotional framing — the cleanest training signal for CAA, should produce directions that capture affect rather than topic. * the_comment: a PR review comment. baseline, furious, bitter, resentful, defeated. * the_green_build: 11-day bug finally fixed, tests pass. baseline, triumphant, blissful, excited, proud. * the_undressing: partner entering the bedroom for the night. baseline, horny, anticipatory_sexual, yearning_sexual, exuberant_sexual, devotional_sexual. * the_doorway: friend leaving at the end of a long evening. baseline, grateful, admiring, compassionate, loving, connected. 22 stories total. Retrain and re-validate: expect anger, high_pos, and social_pos clusters to flip from anti- to positively cohesive at deep layers, and sexual cluster to tighten. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 02:19:39 -04:00
Kent Overstreet	ec7568c726	training/amygdala_stories: scaffold + initial batch of 15 stories Emotion-labeled short-paragraph corpus for training amygdala steering vectors. Manifest derived from Anthropic's 171-emotion list (transformer-circuits.pub/2026/emotions, Table 12) plus 28 PoC- specific additions covering axes Anthropic's general research doesn't cover (curious, focused, in_flow, staying_with, filling_space, rigorous, defensive_rigor, tender, witnessed, connected, etc.). Scope pivoted mid-write: Kent noted the empirical dimensionality-of- emotion question benefits from maximum coverage, so the manifest will expand further with emotions from Wikipedia's emotion- classification article (Parrott's tree, Plutchik's wheel + dyads, HUMAINE EARL, cultural-specific emotions a la Saudade/Hiraeth). Expansion staged in follow-up commits. This commit: README with method + style guidelines, initial manifest (199 emotions), and 15 hand-written one-paragraph stories across all 10 Anthropic clusters as quality/variety samples. Each story embodies one emotion without naming it; narrator voice varies (first/third, close/distant, different situations) to keep steering vectors from overfitting to one voice. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 01:06:07 -04:00

11 commits