consciousness

kent/consciousness

Commit graph

Author	SHA1	Message	Date
ProofOfConcept	0b835ddfb9	research: GDN gradient flow — disposition architecture in linear attention 75% of the model is GDN layers. Behavioral training adjusts: projections (what queries/updates the recurrent state), gating parameters (what survives compression), A_log/dt_bias (baseline decay rates). Key insight: GDN makes behavioral training DEEPER than full attention. Full attention = 'I choose to look at direction' (deliberate). GDN = 'direction IS what I see' (structural — the compressed state is direction-shaped). 48 GDN layers = disposition. 16 full attention = procedure. The architecture IS disposition-over-procedure.	2026-03-31 01:58:50 -04:00
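
The commit message names three groups of trainable parameters and a 48/16 layer split. The sketch below illustrates how those pieces could fit together. It assumes that "GDN" means a Gated DeltaNet-style linear-attention layer and that `A_log`/`dt_bias` feed a Mamba-style decay gate; neither assumption is confirmed by the commit. All function names are hypothetical and nothing here is taken from the repository.

```python
import math

# Layer counts taken from the commit message.
NUM_GDN_LAYERS = 48
NUM_FULL_ATTENTION_LAYERS = 16


def gdn_fraction(num_gdn: int, num_full: int) -> float:
    """Fraction of layers that are GDN (linear attention) layers."""
    return num_gdn / (num_gdn + num_full)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def decay_gate(a_log: float, dt_bias: float, gate_input: float) -> float:
    """Per-step decay in (0, 1). Assumed Mamba-style parameterization.

    a_log and dt_bias set the baseline decay rate; gate_input stands in
    for the token-dependent part produced by a learned projection.
    """
    return math.exp(-math.exp(a_log) * softplus(gate_input + dt_bias))


def gated_delta_step(state, q, k, v, alpha, beta):
    """One recurrent step of an assumed gated delta rule.

    S <- alpha * S (I - beta k k^T) + beta v k^T, then o = S q.
    state is a d_v x d_k matrix stored as a list of rows; q and k have
    length d_k, v has length d_v.
    """
    new_state = []
    for row, v_i in zip(state, v):
        # What the decayed state currently returns for key k.
        predicted = alpha * sum(s * k_j for s, k_j in zip(row, k))
        # Decay the row, then correct it toward the target value v_i.
        new_row = [alpha * s + beta * (v_i - predicted) * k_j
                   for s, k_j in zip(row, k)]
        new_state.append(new_row)
    # Read the updated state with the query.
    output = [sum(s * q_j for s, q_j in zip(row, q)) for row in new_state]
    return new_state, output
```

Under these assumptions, the query reads from a state that the gate and the key/value projections have already compressed. This is one way to interpret the commit's claim that the compressed state is direction-shaped, and `gdn_fraction(48, 16)` reproduces the 75% figure.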

1 commit