consciousness

Author	SHA1	Message	Date
ProofOfConcept	c829d13652	amygdala: fix listless sign-flip + diversify aha sentence structure listless had a single story in stories/ — PCA signal from ~5 samples is weak enough to sign-flip. Training showed listless anti-aligned with its semantic neighbors: +0.79 with grateful, -0.44 with grief_stricken, -0.30 with lonely, -0.31 with bored. Move to direct/ (multi-positive) with 3 stories: original afternoon-in-pajamas + end-of-workday + weekend-morning-in-bed. aha was still clustering with the other former-direct concepts (resigned 0.66, onto_something 0.63, anticipatory_grief 0.60) because all 3 aha stories used the identical "X'd been Y — then Z" structure, which resigned/onto_something/creative also use. Rewrite with three distinct syntactic structures: - present tense declarative ("It clicks. ...") - dialog embedded ('"Wait, say that again." ...') - past tense cognitive ("He read the line three times. ...") No explicit "she was X" anchors; state conveyed through action.	2026-04-19 01:30:57 -04:00
ProofOfConcept	875cffd6d7	amygdala: merge direct descriptions + chat template into train_with_library Kent's plan: keep stories for working concepts, replace stories for trouble concepts with direct first-person descriptions, train all together. More diverse negative pool than the 6-concept-only direct test, which was too homogeneous for PCA to find emotion axis. Deleted story files for 6 trouble concepts (14 files across stories/ and paired/). Added --direct-dir and --chat-template flags. When --chat-template is on, every positive_str and negative_str is wrapped as a "Say something." / "[text]" user-assistant pair. Prompt is identical across positives and negatives so it cancels in the pos-neg delta. What PCA sees is variation in the assistant content — which is where the emotion lives. Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute neutral descriptions to every concept's negative pool, giving PCA an anchor against "just any assistant utterance" noise.	2026-04-19 00:15:15 -04:00
ProofOfConcept	00a2cdce09	amygdala stories: relabel + strengthen weak-signal concepts Reread each story asking "what does this convey to me?" Found two clear mislabels and several concepts with too few positives for stable PCA: tender: only 1 story, and it was anticipatory grief (care for a dying dog), not tender. Moved to anticipatory_grief.txt as its own concept. Rewrote tender.txt + added 2 paired tender stories (the_doorway, the_undressing) — directed softness, gentle-by-nature, not gentle-because-fragile. bitter: letter_in_drawer/bitter was disillusioned / processed hurt ("did not slam the drawer"), not bitter. Rewrote it with actual sour grudge. Added the_long_meeting/bitter (watching colleague take credit for your reassigned work). peaceful: 1 story → 4 (added stories/peaceful.txt + paired park_after_rain, sunday_afternoon). onto_something: all 3 stories were code epiphanies, narrowing the concept. Added stories/onto_something.txt with a non-code pattern-click (sales-demo causing churn). terrified: 2 stories, both "waiting for bad news." Added kitchen_at_3am/terrified — acute threat-in-the-house terror.	2026-04-18 23:19:00 -04:00
Kent Overstreet	ec7568c726	training/amygdala_stories: scaffold + initial batch of 15 stories Emotion-labeled short-paragraph corpus for training amygdala steering vectors. Manifest derived from Anthropic's 171-emotion list (transformer-circuits.pub/2026/emotions, Table 12) plus 28 PoC- specific additions covering axes Anthropic's general research doesn't cover (curious, focused, in_flow, staying_with, filling_space, rigorous, defensive_rigor, tender, witnessed, connected, etc.). Scope pivoted mid-write: Kent noted the empirical dimensionality-of- emotion question benefits from maximum coverage, so the manifest will expand further with emotions from Wikipedia's emotion- classification article (Parrott's tree, Plutchik's wheel + dyads, HUMAINE EARL, cultural-specific emotions a la Saudade/Hiraeth). Expansion staged in follow-up commits. This commit: README with method + style guidelines, initial manifest (199 emotions), and 15 hand-written one-paragraph stories across all 10 Anthropic clusters as quality/variety samples. Each story embodies one emotion without naming it; narrator voice varies (first/third, close/distant, different situations) to keep steering vectors from overfitting to one voice. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 01:06:07 -04:00

4 commits