Kent's plan: keep stories for working concepts, replace stories for
trouble concepts with direct first-person descriptions, train all
together. More diverse negative pool than the 6-concept-only direct
test, which was too homogeneous for PCA to find emotion axis.
Deleted story files for 6 trouble concepts (14 files across stories/
and paired/). Added --direct-dir and --chat-template flags.
When --chat-template is on, every positive_str and negative_str is
wrapped as a "Say something." / "[text]" user-assistant pair. Prompt
is identical across positives and negatives so it cancels in the
pos-neg delta. What PCA sees is variation in the assistant content —
which is where the emotion lives.
Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute
neutral descriptions to every concept's negative pool, giving PCA an
anchor against "just any assistant utterance" noise.
Training on 537c72bd46 showed grief_stricken successfully broke
out of the cozy cluster, but content (single scenario:
sunday_afternoon) took its place — pulled into couch-blanket
phenomenology at cosine 0.68-0.82 with cozy/sensual/resigned.
Same fix: spread each concept across multiple settings so PCA
has to find the valence axis, not the scene axis.
content: + finishing_the_patch, the_writing_session, park_after_rain
resigned: + the_comment, the_long_meeting
Resigned had 2 scenarios (sunday_afternoon, waiting_for_results)
— both about accepting something unwanted in a slow/private
context. Adding work-context resigned (PR review you lost,
restructuring meeting) should pull it out of that cluster.
Target the emotion families that failed to cluster in the initial
training round (layer-wise validation showed them anti-clustered or
scattered at deep layers): anger, high-arousal positive, sexual
range, social positive. Paired scenarios hold content constant and
vary only the emotional framing — the cleanest training signal for
CAA, should produce directions that capture affect rather than
topic.
* the_comment: a PR review comment. baseline, furious, bitter,
resentful, defeated.
* the_green_build: 11-day bug finally fixed, tests pass. baseline,
triumphant, blissful, excited, proud.
* the_undressing: partner entering the bedroom for the night.
baseline, horny, anticipatory_sexual, yearning_sexual,
exuberant_sexual, devotional_sexual.
* the_doorway: friend leaving at the end of a long evening.
baseline, grateful, admiring, compassionate, loving, connected.
22 stories total. Retrain and re-validate: expect anger,
high_pos, and social_pos clusters to flip from anti- to positively
cohesive at deep layers, and sexual cluster to tighten.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>