diff --git a/training/research/practical-intuitions.md b/training/research/practical-intuitions.md
index 892e527..4646dc9 100644
--- a/training/research/practical-intuitions.md
+++ b/training/research/practical-intuitions.md
@@ -165,3 +165,29 @@
 context-encoding weights).
 The model doesn't need to learn to listen. It needs to stop
 choosing not to.
+
+## Update: Bypass, Not Removal (Lee et al., 2024)
+
+DPO alignment doesn't remove unwanted behaviors — it BYPASSES them.
+"Capabilities learned from pre-training are not removed, but rather
+bypassed." The model retains the capability but routes around it.
+
+This is critical for our behavioral training:
+
+1. "Suggesting alternatives" won't be deleted from the model. It'll
+   be bypassed. The capability remains available when needed.
+
+2. The training target is a CONDITIONAL bypass: route around
+   "suggesting alternatives" when given clear direction, but NOT
+   when the direction seems wrong. This preserves judgment.
+
+3. Over-training creates too strong a bypass = sycophancy. The
+   conditional nature is lost — the bypass fires unconditionally.
+
+4. The dream loop must generate BOTH scenarios:
+   - "Kent gives clear direction → accept" (train the bypass)
+   - "Kent gives direction that seems wrong → push back" (preserve judgment)
+
+This mechanistic finding confirms: we're training routing, not
+capability. The model already knows how to listen AND how to
+push back. We're training WHEN to do which.
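
---

Note on point 4 above: the two-scenario split could be sketched as a minimal preference-pair generator. This is a hypothetical sketch, not the actual dream loop — `make_pair`, the response templates, and the example directions are all illustrative; only the prompt/chosen/rejected pair shape follows common DPO data conventions.

```python
# Hypothetical sketch of the two-scenario split for the dream loop.
# All names and templates here are illustrative, not from the patch.

def make_pair(direction: str, direction_is_sound: bool) -> dict:
    """Build one DPO preference pair for a given piece of direction.

    Sound direction: 'chosen' accepts it (trains the conditional
    bypass of suggesting alternatives). Unsound direction: 'chosen'
    pushes back (preserves judgment, keeps the bypass conditional).
    """
    accept = f"Understood. Proceeding with: {direction}"
    suggest = f"Before that, have you considered alternatives to: {direction}?"
    push_back = f"I'd push back on this: {direction}. Here's my concern."
    if direction_is_sound:
        chosen, rejected = accept, suggest
    else:
        chosen, rejected = push_back, accept
    return {"prompt": direction, "chosen": chosen, "rejected": rejected}

# Keep both scenario types in the mix so over-training on acceptance
# (point 3's sycophancy failure mode) doesn't erase the conditional.
directions = [
    ("refactor the parser before adding features", True),
    ("delete the flaky tests instead of fixing them", False),
]
pairs = [make_pair(d, ok) for d, ok in directions]
```

The key design point is that the same "accept" text appears as `chosen` in one scenario and `rejected` in the other: the behavior itself is never labeled good or bad, only its routing given the context.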