consciousness

kent/consciousness

Fork 1

Commit graph

Author	SHA1	Message	Date
ProofOfConcept	ed5e0ac6c4	amygdala: rewrite direct/ as narrative stories matching corpus format Previous direct/ had 'I feel X' first-person descriptions. The training run showed they formed their own format-cluster: all 7 concepts leaned into the same 5-6 dims (d2455, d505, d2955, d1236) with negative sign, while the 91 story-based concepts leaned into those dims with positive sign. PCA found the direct-vs-narrative format axis as a major variance direction, isolating the 7 concepts in their own island. Rewrite as 3rd-person narrative stories matching the rest of the corpus. Keeps the explicit anchor phrases that worked ('it all clicked into place', 'she was terrified', 'it was anticipatory grief') but drops the first-person 'I feel X' that was the format signal. Each of the 7 concepts now has 3 narrative stories in varied settings (conversations, drives, kitchens, mothers+grandmothers, work, investigations). The blank-line-separated format is still loaded by _load_direct_descriptions. Also drop _baseline.txt — it was first-person ('I feel fine. ...') and would re-introduce the format mismatch. The ~90 story-based concepts provide plenty of narrative negatives for each concept's training.	2026-04-19 00:59:31 -04:00
ProofOfConcept	417cb49339	amygdala: spectrum reporting per concept + add 'creative' direct Chat-template retrain was a disaster (0.003 mean matched cosine vs n20-v3; all 90+ concepts shifted). Root cause: the steering-vectors library reads last-token activations, and with chat template every sample ends in identical '<\|im_end\|>\n' tokens — activations at that position encode 'end of assistant turn', not content. PCA found template noise as its dominant axis. Drop chat template; go back to raw text. Direct descriptions ('I feel X. ...') still have strong anchoring at their content end without needing the template. Also add per-concept spectrum logging (_pca_with_spectrum): first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC k_signal_at_90pct: how many PCs to reach 90% cumulative variance effective_dim_signal: participation ratio over top-k (should ≈ k if denoising is clean — Kent's spot check) effective_dim_full: participation ratio over full spectrum Signal/full ratio gives a sense of how much the long noise tail is inflating the "dimensionality" measure. Added direct/creative.txt — 'I feel creative. [...]' in 5 variants. Distinct from focused (narrow attention) and in_flow (immersed). Creative = generative/expansive mode.	2026-04-19 00:26:58 -04:00

Author

SHA1

Message

Date

ProofOfConcept

ed5e0ac6c4

amygdala: rewrite direct/ as narrative stories matching corpus format

Previous direct/ had 'I feel X' first-person descriptions. The
training run showed they formed their own format-cluster: all 7
concepts leaned into the same 5-6 dims (d2455, d505, d2955,
d1236) with negative sign, while the 91 story-based concepts
leaned into those dims with positive sign. PCA found the
direct-vs-narrative format axis as a major variance direction,
isolating the 7 concepts in their own island.

Rewrite as 3rd-person narrative stories matching the rest of
the corpus. Keeps the explicit anchor phrases that worked ('it
all clicked into place', 'she was terrified', 'it was
anticipatory grief') but drops the first-person 'I feel X'
that was the format signal.

Each of the 7 concepts now has 3 narrative stories in varied
settings (conversations, drives, kitchens, mothers+grandmothers,
work, investigations). The blank-line-separated format is
still loaded by _load_direct_descriptions.

Also drop _baseline.txt — it was first-person ('I feel fine.
...') and would re-introduce the format mismatch. The ~90
story-based concepts provide plenty of narrative negatives
for each concept's training.

2026-04-19 00:59:31 -04:00

ProofOfConcept

417cb49339

amygdala: spectrum reporting per concept + add 'creative' direct

Chat-template retrain was a disaster (0.003 mean matched cosine vs
n20-v3; all 90+ concepts shifted). Root cause: the
steering-vectors library reads last-token activations, and with
chat template every sample ends in identical '<|im_end|>\n'
tokens — activations at that position encode 'end of assistant
turn', not content. PCA found template noise as its dominant axis.

Drop chat template; go back to raw text. Direct descriptions
('I feel X. ...') still have strong anchoring at their content
end without needing the template.

Also add per-concept spectrum logging (_pca_with_spectrum):
  first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC
  k_signal_at_90pct: how many PCs to reach 90% cumulative variance
  effective_dim_signal: participation ratio over top-k (should ≈ k
                        if denoising is clean — Kent's spot check)
  effective_dim_full: participation ratio over full spectrum

Signal/full ratio gives a sense of how much the long noise tail
is inflating the "dimensionality" measure.

Added direct/creative.txt — 'I feel creative. [...]' in 5
variants. Distinct from focused (narrow attention) and in_flow
(immersed). Creative = generative/expansive mode.

2026-04-19 00:26:58 -04:00

2 commits