Previous direct/ had 'I feel X' first-person descriptions. The
training run showed they formed their own format-cluster: all 7
concepts leaned into the same 5-6 dims (d2455, d505, d2955,
d1236) with negative sign, while the 91 story-based concepts
leaned into those dims with positive sign. PCA found the
direct-vs-narrative format axis as a major variance direction,
isolating the 7 concepts in their own island.
Rewrite as 3rd-person narrative stories matching the rest of
the corpus. Keeps the explicit anchor phrases that worked ('it
all clicked into place', 'she was terrified', 'it was
anticipatory grief') but drops the first-person 'I feel X'
that was the format signal.
Each of the 7 concepts now has 3 narrative stories in varied
settings (conversations, drives, kitchens, mothers+grandmothers,
work, investigations). The blank-line-separated format is
still loaded by _load_direct_descriptions.
Also drop _baseline.txt — it was first-person ('I feel fine.
...') and would re-introduce the format mismatch. The ~90
story-based concepts provide plenty of narrative negatives
for each concept's training.
Chat-template retrain was a disaster (0.003 mean matched cosine vs
n20-v3; all 90+ concepts shifted). Root cause: the
steering-vectors library reads last-token activations, and with
chat template every sample ends in identical '<|im_end|>\n'
tokens — activations at that position encode 'end of assistant
turn', not content. PCA found template noise as its dominant axis.
Drop chat template; go back to raw text. Direct descriptions
('I feel X. ...') still have strong anchoring at their content
end without needing the template.
Also add per-concept spectrum logging (_pca_with_spectrum):
first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC
k_signal_at_90pct: how many PCs to reach 90% cumulative variance
effective_dim_signal: participation ratio over top-k (should ≈ k
if denoising is clean — Kent's spot check)
effective_dim_full: participation ratio over full spectrum
Signal/full ratio gives a sense of how much the long noise tail
is inflating the "dimensionality" measure.
Added direct/creative.txt — 'I feel creative. [...]' in 5
variants. Distinct from focused (narrow attention) and in_flow
(immersed). Creative = generative/expansive mode.