kill .claude

This commit is contained in:
Kent Overstreet 2026-04-09 20:00:05 -04:00
parent 929415af3b
commit ff5be3e792
6 changed files with 0 additions and 0 deletions

# Daemon & Jobkit Architecture Survey
_2026-03-14, autonomous survey while Kent debugs discard FIFO_
## Current state
daemon.rs is 1952 lines mixing three concerns:
- ~400 lines: pure jobkit usage (spawn, depend_on, resource)
- ~600 lines: logging/monitoring (log_event, status, RPC)
- ~950 lines: job functions embedding business logic
## What jobkit provides (good)
- Worker pool with named workers
- Dependency graph: `depend_on()` for ordering
- Resource pools: `ResourcePool` for concurrency gating (LLM slots)
- Retry logic: `retries(N)` on `TaskError::Retry`
- Task status tracking: `choir.task_statuses()` → `Vec<TaskInfo>`
- Cancellation: `ctx.is_cancelled()`
## What jobkit is missing
### 1. Structured logging (PRIORITY)
- Currently dual-channel: `ctx.log_line()` (per-task) + `log_event()` (daemon JSONL)
- No log levels, no structured context, no correlation IDs
- Log rotation is naive (truncate at 1MB, keep second half)
- Need: observability hooks that both human TUI and AI can consume
### 2. Metrics (NONE EXIST)
- No task duration histograms
- No worker utilization tracking
- No queue depth monitoring
- No success/failure rates by type
- No resource pool wait times
### 3. Health monitoring
- No watchdog timers
- No health check hooks per job
- No alerting on threshold violations
- Health computed on-demand in daemon, not in jobkit
### 4. RPC (ad-hoc in daemon, should be schematized)
- Unix socket with string matching: `match cmd.as_str()`
- No Cap'n Proto schema for daemon control
- No versioning, no validation, no streaming
## Architecture problems
### Tangled concerns
Job functions hardcode `log_event()` calls. Graph health is in daemon
but uses domain-specific metrics. Store loading happens inside jobs
(10 agent runs = 10 store loads). Not separable.
### Magic numbers
- Workers = `llm_concurrency + 3` (line 682)
- 10 max new jobs per tick (line 770)
- 300/1800s backoff range (lines 721-722)
- 1MB log rotation (line 39)
- 60s scheduler interval (line 24)
None configurable.
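A minimal sketch of lifting those magic numbers into configuration — struct and field names are illustrative, with today's hardcoded values as defaults:

```rust
// Hypothetical config struct; not existing code. Defaults mirror the
// currently hardcoded constants listed above.
struct DaemonConfig {
    extra_workers: usize,          // workers = llm_concurrency + extra_workers
    max_new_jobs_per_tick: usize,
    backoff_min_secs: u64,
    backoff_max_secs: u64,
    log_rotate_bytes: u64,
    scheduler_interval_secs: u64,
}

impl Default for DaemonConfig {
    fn default() -> Self {
        Self {
            extra_workers: 3,
            max_new_jobs_per_tick: 10,
            backoff_min_secs: 300,
            backoff_max_secs: 1800,
            log_rotate_bytes: 1 << 20, // 1MB
            scheduler_interval_secs: 60,
        }
    }
}

fn main() {
    let cfg = DaemonConfig::default();
    assert_eq!(cfg.backoff_min_secs, 300);
    assert_eq!(cfg.log_rotate_bytes, 1_048_576);
}
```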
### Hardcoded pipeline DAG
Daily pipeline phases are `depend_on()` chains in Rust code (lines
1061-1109). Can't adjust without recompile. No visualization. No
conditional skipping of phases.
### Task naming is fragile
Names used as both identifiers AND for parsing in TUI. Format varies
(colons, dashes, dates). `task_group()` splits on '-' to categorize —
brittle.
### No persistent task queue
Restart loses all pending tasks. Session watcher handles this via
reconciliation (good), but scheduler uses `last_daily` date from file.
## What works well
1. **Reconciliation-based session discovery** — elegant, restart-resilient
2. **Resource pooling** — LLM concurrency decoupled from worker count
3. **Dependency-driven pipeline** — clean DAG via `depend_on()`
4. **Retry with backoff** — exponential 5min→30min, resets on success
5. **Graceful shutdown** — SIGINT/SIGTERM handled properly
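Item 4 in miniature, assuming a failure count drives the doubling (the actual jobkit mechanism may differ):

```rust
// Exponential backoff from 5 min to a 30 min cap; a success resets the
// failure count to zero. Sketch only, not the jobkit implementation.
fn backoff_secs(consecutive_failures: u32) -> u64 {
    let base: u64 = 300; // 5 minutes
    let cap: u64 = 1800; // 30 minutes
    base.saturating_mul(1u64 << consecutive_failures.min(6)).min(cap)
}

fn main() {
    assert_eq!(backoff_secs(0), 300);
    assert_eq!(backoff_secs(1), 600);
    assert_eq!(backoff_secs(10), 1800);
}
```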
## Kent's design direction
### Event stream, not log files
One pipeline, multiple consumers. TUI renders for humans, AI consumes
structured data. Same events, different renderers. Cap'n Proto streaming
subscription: `subscribe(filter) -> stream<Event>`.
"No one ever thinks further ahead than log files with monitoring and
it's infuriating." — Kent
### Extend jobkit, don't add a layer
jobkit already has the scheduling and dependency graph. Don't create a
new orchestration layer — add the missing pieces (logging, metrics,
health, RPC) to jobkit itself.
### Cap'n Proto for everything
Standard RPC definitions for:
- Status queries (what's running, pending, failed)
- Control (start, stop, restart, queue)
- Event streaming (subscribe with filter)
- Health checks
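A sketch of what such a schema could look like — interface and method names are illustrative, not an existing file. Cap'n Proto streams events by handing the server a callback capability:

```capnp
# Illustrative only — not the project's schema.
interface DaemonControl {
  status @0 () -> (running :List(Text), pending :List(Text), failed :List(Text));
  start @1 (job :Text) -> (ok :Bool);
  stop @2 (job :Text) -> (ok :Bool);
  subscribe @3 (filter :Text, sink :EventSink) -> ();
  health @4 () -> (healthy :Bool, detail :Text);
}

interface EventSink {
  push @0 (event :Text) -> ();
}
```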
## The bigger picture: bcachefs as library
Kent's monitoring system in bcachefs (event_inc/event_inc_trace + x-macro
counters) is the real monitoring infrastructure. 1-1 correspondence between
counters (cheap, always-on dashboard via `fs top`) and tracepoints (expensive
detail, only runs when enabled). The x-macro enforces this — can't have one
without the other.
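The pairing can be sketched in Rust with a declarative macro playing the x-macro's role — each name expands to both a counter and a trace stub, so one can't exist without the other (an analogue for illustration, not the bcachefs code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// One macro invocation defines both sides: an always-on counter and a
// matching tracepoint stub. Adding a name to the list creates both.
macro_rules! define_events {
    ($($name:ident),* $(,)?) => {
        #[allow(non_upper_case_globals)]
        mod counters {
            use super::*;
            $( pub static $name: AtomicU64 = AtomicU64::new(0); )*
        }
        mod trace {
            $( pub fn $name(detail: &str) {
                eprintln!("trace {}: {}", stringify!($name), detail);
            } )*
        }
    };
}

define_events!(btree_split, journal_flush);

fn main() {
    counters::btree_split.fetch_add(1, Ordering::Relaxed); // cheap, always on
    trace::btree_split("level=2"); // expensive detail, gated in real life
    assert_eq!(counters::btree_split.load(Ordering::Relaxed), 1);
}
```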
When the Rust conversion is complete, bcachefs becomes a library. At that
point, jobkit doesn't need its own monitoring — it uses the same counter/
tracepoint infrastructure. One observability system for everything.
**Implication for now:** jobkit monitoring just needs to be good enough.
JSON events, not typed. Don't over-engineer — the real infrastructure is
coming from the Rust conversion.
## Extraction: jobkit-daemon library (designed with Kent)
### Goes to jobkit-daemon (generic)
- JSONL event logging with size-based rotation
- Unix domain socket server + signal handling
- Status file writing (periodic JSON snapshot)
- `run_job()` wrapper (logging + progress + error mapping)
- Systemd service installation
- Worker pool setup from config
- Cap'n Proto RPC for control protocol
### Stays in poc-memory (application)
- All job functions (experience-mine, fact-mine, consolidation, etc.)
- Session watcher, scheduler, RPC command handlers
- GraphHealth, consolidation plan logic
### Interface design
- Cap'n Proto RPC for typed operations (submit, cancel, subscribe)
- JSON blob for status (inherently open-ended, every app has different
job types — typing this is the tracepoint mistake)
- Application registers: RPC handlers, long-running tasks, job functions
- ~50-100 lines of setup code, call `daemon.run()`
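The setup could look roughly like this — the builder name and method signatures are assumptions, sketched only to show the registration shape:

```rust
// Hypothetical jobkit-daemon setup: the application registers job
// functions, then hands control to the daemon loop.
type JobFn = fn() -> Result<(), String>;

struct DaemonBuilder {
    jobs: Vec<(&'static str, JobFn)>,
}

impl DaemonBuilder {
    fn new() -> Self {
        Self { jobs: Vec::new() }
    }
    fn register_job(mut self, name: &'static str, f: JobFn) -> Self {
        self.jobs.push((name, f));
        self
    }
    // Stand-in for daemon.run(): the real loop would own the socket,
    // signals, event log, and worker pool.
    fn run(self) {
        for (name, f) in &self.jobs {
            println!("{}: {:?}", name, f());
        }
    }
}

fn experience_mine() -> Result<(), String> { Ok(()) }

fn main() {
    DaemonBuilder::new()
        .register_job("experience-mine", experience_mine)
        .run();
}
```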
## Plan of attack
1. **Observability hooks in jobkit** — `on_task_start/progress/complete`
callbacks that consumers can subscribe to
2. **Structured event type** — typed events with task ID, name, duration,
result, metadata. Not strings.
3. **Metrics collection** — duration histograms, success rates, queue
depth. Built on the event stream.
4. **Cap'n Proto daemon RPC schema** — replace ad-hoc socket protocol
5. **TUI consumes event stream** — same data as AI consumer
6. **Extract monitoring from daemon.rs** — the 600 lines of logging/status
become generic, reusable infrastructure
7. **Declarative pipeline config** — DAG definition in config, not code
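Steps 1–3 might hang off a typed event like this (field names are guesses at the shape, not a spec), with metrics as folds over the same stream the TUI and AI consumers subscribe to:

```rust
use std::time::Duration;

// Typed task events replacing string log lines.
#[derive(Debug)]
enum TaskEvent {
    Started { id: u64, name: String },
    Progress { id: u64, pct: u8 },
    Completed { id: u64, duration: Duration, ok: bool },
}

// One metric built on the stream: success rate over completed tasks.
fn success_rate(events: &[TaskEvent]) -> f64 {
    let completed: Vec<bool> = events
        .iter()
        .filter_map(|e| match e {
            TaskEvent::Completed { ok, .. } => Some(*ok),
            _ => None,
        })
        .collect();
    if completed.is_empty() {
        return 1.0;
    }
    completed.iter().filter(|ok| **ok).count() as f64 / completed.len() as f64
}

fn main() {
    let events = vec![
        TaskEvent::Started { id: 1, name: "fact-mine".into() },
        TaskEvent::Completed { id: 1, duration: Duration::from_secs(3), ok: true },
        TaskEvent::Completed { id: 2, duration: Duration::from_secs(9), ok: false },
    ];
    assert_eq!(success_rate(&events), 0.5);
}
```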
## File reference
- `src/agents/daemon.rs` — 1952 lines, all orchestration
- Job functions: 96-553
- run_daemon(): 678-1143
- Socket/RPC: 1145-1372
- Status display: 1374-1682
- `src/tui.rs` — 907 lines, polls status socket every 2s
- `schema/memory.capnp` — 125 lines, data only, no RPC definitions
- `src/config.rs` — configuration loading
- External: `jobkit` crate (git dependency)
## Mistakes I made building this (learning notes)
_Per Kent's instruction: note what went wrong and WHY._
1. **Dual logging channels** — I added `log_event()` because `ctx.log_line()`
wasn't enough, instead of fixing the underlying abstraction. Symptom:
can't find a failed job without searching two places.
2. **Magic numbers** — I hardcoded constants because "I'll make them
configurable later." Later never came. Every magic number is a design
decision that should have been explicit.
3. **1952-line file** — daemon.rs grew organically because each new feature
was "just one more function." Should have extracted when it passed 500
lines. The pain of refactoring later is always worse than the pain of
organizing early.
4. **Ad-hoc RPC** — String matching seemed fine for 2 commands. Now it's 4
commands and growing, with implicit formats. Should have used Cap'n Proto
from the start — the schema IS the documentation.
5. **No tests** — Zero tests in daemon code. "It's a daemon, how do you test
it?" is not an excuse. The job functions are pure-ish and testable. The
scheduler logic is testable with a clock abstraction.
6. **Not using systemd** — There's a systemd service for the daemon.
I keep starting it manually with `poc-memory agent daemon start` and
accumulating multiple instances. Tonight: 4 concurrent daemons, 32
cores pegged at 95%, load average 92. USE SYSTEMD. That's what it's for.
`systemctl --user start poc-memory-daemon`. ONE instance. Managed.
Pattern: every shortcut was "just for now" and every "just for now" became
permanent. Kent's yelling was right every time.

# Link Strength Feedback Design
_2026-03-14, designed with Kent_
## The two signals
### "Not relevant" → weaken the EDGE
The routing failed. Search followed a link and arrived at a node that
doesn't relate to what I was looking for. The edge carried activation
where it shouldn't have.
- Trace back through memory-search's recorded activation path
- Identify which edge(s) carried activation to the bad result
- Weaken those edges by a conscious-scale delta (0.01)
### "Not useful" → weaken the NODE
The routing was correct but the content is bad. The node itself isn't
valuable — stale, wrong, poorly written, duplicate.
- Downweight the node (existing `poc-memory wrong` behavior)
- Don't touch the edges — the path was correct, the destination was bad
## Three tiers of adjustment
### Tier 1: Agent automatic (0.00001 per event)
- Agent follows edge A→B during a run
- If the run produces output that gets `used` → strengthen A→B
- If the run produces nothing useful → weaken A→B
- The agent doesn't know this is happening — daemon tracks it
- Clamped to [0.05, 0.95] — edges can never hit 0 or 1
- Logged: every adjustment recorded with (agent, edge, delta, timestamp)
### Tier 2: Conscious feedback (0.01 per event)
- `poc-memory not-relevant KEY` → trace activation path, weaken edges
- `poc-memory not-useful KEY` → downweight node
- `poc-memory used KEY` → strengthen edges in the path that got here
- 100x stronger than agent signal — deliberate judgment
- Still clamped, still logged
### Tier 3: Manual override (direct set)
- `poc-memory graph link-strength SRC DST VALUE` → set directly
- For when we know exactly what a strength should be
- Rare, but needed for bootstrapping / correction
## Implementation: recording the path
memory-search already computes the spread activation trace. Need to:
1. Record the activation path for each result (which edges carried how
much activation to arrive at this node)
2. Persist this per-session so `not-relevant` can look it up
3. The `record-hits` RPC already sends keys to the daemon — extend
to include (key, activation_path) pairs
## Implementation: agent tracking
In the daemon's job functions:
1. Before LLM call: record which nodes and edges the agent received
2. After LLM call: parse output for LINK/WRITE_NODE actions
3. If actions are created and later get `used` → the input edges were useful
4. If no actions or actions never used → the input edges weren't useful
5. This is a delayed signal — requires tracking across time
Simpler first pass: just track co-occurrence. If two nodes appear
together in a successful agent run, strengthen the edge between them.
No need to track which specific edge was "followed."
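That first pass could be as small as this — delta and clamp bounds per the tiers above, with the strength store stubbed as a map:

```rust
use std::collections::HashMap;

// Strengthen every pair of nodes that co-occurred in a successful run.
// Sketch, not the daemon code; edges default to 0.5 when first seen.
fn strengthen_cooccurring(
    strengths: &mut HashMap<(String, String), f32>,
    nodes: &[&str],
    delta: f32,
) {
    for i in 0..nodes.len() {
        for j in (i + 1)..nodes.len() {
            let edge = (nodes[i].to_string(), nodes[j].to_string());
            let s = strengths.entry(edge).or_insert(0.5);
            *s = (*s + delta).clamp(0.05, 0.95);
        }
    }
}

fn main() {
    let mut strengths = HashMap::new();
    strengthen_cooccurring(&mut strengths, &["btree", "journal"], 0.00001);
    let s = strengths[&("btree".to_string(), "journal".to_string())];
    assert!((s - 0.50001).abs() < 1e-6);
}
```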
## Clamping
```rust
fn adjust_strength(current: f32, delta: f32) -> f32 {
    (current + delta).clamp(0.05, 0.95)
}
```
Edges can asymptotically approach 0 or 1 but never reach them.
This prevents dead edges (can always be revived by strong signal)
and prevents edges from becoming unweakenable.
## Logging
Every adjustment logged as JSON event:
```json
{"ts": "...", "event": "strength_adjust", "source": "agent|conscious|manual",
"edge": ["nodeA", "nodeB"], "old": 0.45, "new": 0.4501, "delta": 0.0001,
"reason": "co-retrieval in linker run c-linker-42"}
```
This lets us:
- Watch the distribution shift over time
- Identify edges that are oscillating (being pulled both ways)
- Tune the delta values based on observed behavior
- Roll back if something goes wrong
## Migration from current commands
- `poc-memory wrong KEY [CTX]` → splits into `not-relevant` and `not-useful`
- `poc-memory used KEY` → additionally strengthens edges in activation path
- Both old commands continue to work for backward compat, mapped to the
most likely intent (wrong → not-useful, used → strengthen path)

doc/dmn-algorithm-plan.md
# DMN Idle Activation Algorithm — Plan
Status: design phase, iterating with Kent
Date: 2026-03-05
## Problem
The idle timer asks "what's interesting?" but I default to introspection
instead of reaching outward. A static list of activities is a crutch.
The real solution: when idle, the system surfaces things that are
*salient to me right now* based on graph state — like biological DMN.
## Algorithm (draft 1)
1. **Seed selection** (5-10 nodes):
- Recently accessed (lookups, last 24h)
- High emotion (> 5)
- Unfinished work (task-category, open gaps)
- Temporal resonance (anniversary activation — created_at near today)
- External context (IRC mentions, git commits, work queue)
2. **Spreading activation** from seeds through graph edges,
decaying by distance, weighted by edge strength. 2-3 hops max.
3. **Refractory suppression** — nodes surfaced in last 6h get
suppressed. Prevents hub dominance (identity.md, patterns.md).
Track in dmn-recent.json.
4. **Spectral diversity** — pick from different spectral clusters
so the output spans the graph rather than clustering in one region.
Use cached spectral-save embedding.
5. **Softmax sampling** (temperature ~0.7) — pick 3-5 threads.
Each thread = node + seed that activated it (explains *why*).
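Step 5 in miniature — temperature softmax over activation scores; to stay dependency-free the uniform draw is passed in by the caller (sketch only):

```rust
// Pick an index with probability proportional to exp(score / temperature).
// `uniform` is a draw in [0, 1) supplied by the caller.
fn softmax_pick(scores: &[f64], temperature: f64, uniform: f64) -> usize {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores
        .iter()
        .map(|s| ((s - max) / temperature).exp())
        .collect();
    let total: f64 = exps.iter().sum();
    let mut acc = 0.0;
    for (i, e) in exps.iter().enumerate() {
        acc += e / total;
        if uniform < acc {
            return i;
        }
    }
    scores.len() - 1
}

fn main() {
    // With one score far above the rest, sampling almost surely picks it.
    assert_eq!(softmax_pick(&[0.1, 8.0, 0.2], 0.7, 0.5), 1);
}
```

Lower temperature sharpens toward the top-scoring node; higher temperature spreads picks across clusters, which is what keeps the output from feeling like a smart todo list.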
## Output format
```
DMN threads (2026-03-05 08:30):
→ Vandervecken identity frame (seed: recent journal)
→ Ada — unread, in books dir (seed: Kent activity)
→ check_allocations pass — connects to PoO (seed: recent work)
→ [explore] sheaf theory NL parsing (seed: spectral outlier)
```
## Integration
Called by idle timer. Replaces bare "what's interesting?" with
concrete threads + "What do you want to do?"
## Simulated scenarios
**3am, Kent asleep, IRC dead:**
Seeds → Protector nodes, memory work, Vandervecken (emotion).
Output → identity thread, Ada, paper literature review, NL parsing.
Would have prevented 15 rounds of "nothing new."
**6am, Kent waking, KruslLee on IRC:**
Seeds → readahead question, memory work, PoO additions.
Output → verify readahead answer, show Kent memory work, opts_from_sb.
Would have reached dev_readahead correction faster.
## Known risks
- **Hub dominance**: refractory period is load-bearing
- **Stale suggestions**: data freshness, not algorithm problem
- **Cold start**: fall back to high-weight core + recent journal
- **Over-determinism**: spectral diversity + temperature prevent
it feeling like a smart todo list
## Open questions
- Spectral embedding: precompute + cache, or compute on demand?
- Refractory period: 6h right? Or adaptive?
- How to detect "unfinished work" reliably?
- Should external context (IRC, git) be seeds or just boosters?

# Query Language Design — Unifying Search and Agent Selection
Date: 2026-03-10
Status: Phase 1 complete (2026-03-10)
## Problem
Agent node selection is hardcoded in Rust (`prompts.rs`). Adding a new
agent means editing Rust, recompiling, restarting the daemon. The
existing search pipeline (spread, spectral, etc.) handles graph
exploration but can't express structured predicates on node fields.
We need one system that handles both:
- **Search**: "find nodes related to these terms" (graph exploration)
- **Selection**: "give me episodic nodes not seen by linker in 7 days,
sorted by priority" (structured predicates)
## Design Principle
The pipeline already exists: stages compose left-to-right, each
transforming a result set. We extend it with predicate stages that
filter/sort on node metadata, alongside the existing graph algorithm
stages.
An agent definition becomes a query expression + prompt template.
The daemon scheduler is just "which queries have stale results."
## Current Pipeline
```
seeds → [stage1] → [stage2] → ... → results
```
Each stage takes `Vec<(String, f64)>` (key, score) and returns the same.
Stages are parsed from strings: `spread,max_hops=4` or `spectral,k=20`.
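Parsing that stage syntax is small — a name, then comma-separated `key=value` options (a sketch of the convention, not the actual parser):

```rust
// "spread,max_hops=4" -> ("spread", [("max_hops", "4")])
fn parse_stage(s: &str) -> (String, Vec<(String, String)>) {
    let mut parts = s.split(',');
    let name = parts.next().unwrap_or("").trim().to_string();
    let opts = parts
        .filter_map(|p| p.split_once('='))
        .map(|(k, v)| (k.trim().to_string(), v.trim().to_string()))
        .collect();
    (name, opts)
}

fn main() {
    let (name, opts) = parse_stage("spread,max_hops=4");
    assert_eq!(name, "spread");
    assert_eq!(opts, vec![("max_hops".to_string(), "4".to_string())]);
}
```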
## Proposed Extension
### Two kinds of stages
**Generators** — produce a result set from nothing (or from the store):
```
all # every non-deleted node
match:btree # text match (current seed extraction)
```
**Filters** — narrow an existing result set:
```
type:episodic # node_type == EpisodicSession
type:semantic # node_type == Semantic
key:journal#j-* # glob match on key
key-len:>=60 # key length predicate
weight:>0.5 # numeric comparison
age:<7d # created/modified within duration
content-len:>1000 # content size filter
provenance:manual # provenance match
not-visited:linker,7d # not seen by agent in duration
visited:linker # HAS been seen by agent (for auditing)
community:42 # community membership
```
**Transforms** — reorder or reshape:
```
sort:priority # consolidation priority scoring
sort:timestamp # by timestamp (desc by default)
sort:content-len # by content size
sort:degree # by graph degree
sort:weight # by weight
limit:20 # truncate
```
**Graph algorithms** (existing, unchanged):
```
spread # spreading activation
spectral,k=20 # spectral nearest neighbors
confluence # multi-source reachability
geodesic # straightest spectral paths
manifold # extrapolation along seed direction
```
### Syntax
Pipe-separated stages, same as current `-p` flag:
```
all | type:episodic | not-visited:linker,7d | sort:priority | limit:20
```
Or on the command line:
```
poc-memory search -p all -p type:episodic -p not-visited:linker,7d -p sort:priority -p limit:20
```
Current search still works unchanged:
```
poc-memory search btree journal -p spread
```
(terms become `match:` seeds implicitly)
### Agent definitions
A TOML file in `~/.claude/memory/agents/`:
```toml
# agents/linker.toml
[query]
pipeline = "all | type:episodic | not-visited:linker,7d | sort:priority | limit:20"
[prompt]
template = "linker.md"
placeholders = ["TOPOLOGY", "NODES"]
[execution]
model = "sonnet"
actions = ["link-add", "weight"] # allowed poc-memory actions in response
schedule = "daily" # or "on-demand"
```
The daemon reads agent definitions, executes their queries, fills
templates, calls the model, records visits on success.
### Implementation Plan
#### Phase 1: Filter stages in pipeline
Add to `search.rs`:
```rust
enum Stage {
Generator(Generator),
Filter(Filter),
Transform(Transform),
Algorithm(Algorithm), // existing
}
enum Generator {
All,
Match(Vec<String>), // current seed extraction
}
enum Filter {
Type(NodeType),
KeyGlob(String),
KeyLen(Comparison),
Weight(Comparison),
Age(Comparison), // vs now - timestamp
ContentLen(Comparison),
Provenance(Provenance),
NotVisited { agent: String, duration: Duration },
Visited { agent: String },
Community(u32),
}
enum Transform {
Sort(SortField),
Limit(usize),
}
enum Comparison {
Gt(f64),
Gte(f64),
Lt(f64),
Lte(f64),
Eq(f64),
}
enum SortField {
Priority,
Timestamp,
ContentLen,
Degree,
Weight,
}
```
The pipeline runner checks stage type:
- Generator: ignores input, produces new result set
- Filter: keeps items matching predicate, preserves scores
- Transform: reorders or truncates
- Algorithm: existing graph exploration (needs Graph)
Filter/Transform stages need access to the Store (for node fields)
and VisitIndex (for visit predicates). The `StoreView` trait already
provides node access; extend it for visits.
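A sketch of the filter half of that dispatch, using the Comparison enum from Phase 1; the field lookup is stubbed as a closure since the Store types live elsewhere:

```rust
#[derive(Clone, Copy)]
enum Comparison {
    Gt(f64),
    Gte(f64),
    Lt(f64),
    Lte(f64),
    Eq(f64),
}

fn matches(c: Comparison, v: f64) -> bool {
    match c {
        Comparison::Gt(x) => v > x,
        Comparison::Gte(x) => v >= x,
        Comparison::Lt(x) => v < x,
        Comparison::Lte(x) => v <= x,
        Comparison::Eq(x) => v == x,
    }
}

// Filter stage: drop items whose field fails the predicate; scores of
// survivors are untouched.
fn apply_filter(
    results: Vec<(String, f64)>,
    field: impl Fn(&str) -> f64,
    cmp: Comparison,
) -> Vec<(String, f64)> {
    results
        .into_iter()
        .filter(|(key, _)| matches(cmp, field(key)))
        .collect()
}

fn main() {
    let results = vec![("a".to_string(), 0.9), ("long-key".to_string(), 0.4)];
    // key-len:>=2 keeps only "long-key", score preserved.
    let out = apply_filter(results, |k| k.len() as f64, Comparison::Gte(2.0));
    assert_eq!(out, vec![("long-key".to_string(), 0.4)]);
}
```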
#### Phase 2: Agent-as-config
Parse TOML agent definitions. The daemon:
1. Reads `agents/*.toml`
2. For each with `schedule = "daily"`, checks if query results have
been visited recently enough
3. If stale, executes: parse pipeline → run query → format nodes →
fill template → call model → parse actions → record visits
Hot reload: watch the agents directory, pick up changes without restart.
#### Phase 3: Retire hardcoded agents
Migrate each hardcoded agent (replay, linker, separator, transfer,
rename, split) to a TOML definition. Remove the match arms from
`agent_prompt()`. The separator agent is the trickiest — its
"interference pair" selection is a join-like operation that may need
a custom generator stage rather than simple filtering.
## What we're NOT building
- A general-purpose SQL engine. No joins, no GROUP BY, no subqueries.
- Persistent indices. At ~13k nodes, full scan with predicate evaluation
is fast enough (~1ms). Add indices later if profiling demands it.
- A query optimizer. Pipeline stages execute in declaration order.
## StoreView Considerations
The existing `StoreView` trait only exposes `(key, content, weight)`.
Filter stages need access to `node_type`, `timestamp`, `key`, etc.
Options:
- (a) Expand StoreView with `node_meta()` returning a lightweight struct
- (b) Filter stages require `&Store` directly (not trait-polymorphic)
- (c) Add `fn node(&self, key: &str) -> Option<NodeRef>` to StoreView
Option (b) is simplest for now — agents always use a full Store. The
search hook (MmapView path) doesn't need agent filters. We can
generalize to (c) later if MmapView needs filter support.
For Phase 1, filter stages take `&Store` and the pipeline runner
dispatches: algorithm stages use `&dyn StoreView`, filter/transform
stages use `&Store`. This keeps the fast MmapView path for interactive
search untouched.
## Open Questions
1. **Separator agent**: Its "interference pairs" selection doesn't fit
the filter model cleanly. Best option is a custom generator stage
`interference-pairs,min_sim=0.5` that produces pair keys.
2. **Priority scoring**: `sort:priority` calls `consolidation_priority()`
which needs graph + spectral. This is a transform that needs the
full pipeline context — treat it as a "heavy sort" that's allowed
to compute.
3. **Duration syntax**: `7d`, `24h`, `30m`. Parse with simple regex
`(\d+)(d|h|m)` → seconds.
4. **Negation**: Prefix `!` on predicate: `!type:episodic`.
5. **Backwards compatibility**: Current `-p spread` syntax must keep
working. The parser tries algorithm names first, then predicate
syntax. No ambiguity since algorithms are bare words and predicates
use `:`.
6. **Stage ordering**: Generators must come first (or the pipeline
starts with implicit "all"). Filters/transforms can interleave
freely with algorithms. The runner validates this at parse time.
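Open question 3 barely needs a regex — a split suffices (sketch):

```rust
// "7d" / "24h" / "30m" -> seconds; anything else -> None.
fn parse_duration_secs(s: &str) -> Option<u64> {
    let split_point = s.len().checked_sub(1)?;
    let (num, unit) = s.split_at(split_point);
    let n: u64 = num.parse().ok()?;
    match unit {
        "d" => Some(n * 86_400),
        "h" => Some(n * 3_600),
        "m" => Some(n * 60),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_duration_secs("7d"), Some(604_800));
    assert_eq!(parse_duration_secs("24h"), Some(86_400));
    assert_eq!(parse_duration_secs("x"), None);
}
```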

# Memory Scoring Persistence — Analysis (2026-04-07)
## Problem
Scores computed by `score_memories_incremental` are written to
`ConversationEntry::Memory::score` (in-memory, serialized to
conversation.log) but never written back to the Store. This means:
- `Node.last_scored` stays at 0 — every restart re-scores everything
- `score_weight()` in `ops.rs:304-313` exists but is never called
- Scoring is wasted work on every session start
## Fix
In `mind/mod.rs` scoring completion handler (currently ~line 341-352),
after writing scores to entries, also persist to Store:
```rust
if let Ok(ref scores) = result {
    let mut ag = agent.lock().await;
    // Write to entries (already done)
    for (key, weight) in scores { ... }
    // NEW: persist to Store
    let store_arc = Store::cached().await.ok();
    if let Some(arc) = store_arc {
        let mut store = arc.lock().await;
        for (key, weight) in scores {
            store.score_weight(key, *weight as f32);
        }
        store.save().ok();
    }
}
```
This calls `score_weight()` which updates `node.weight` and sets
`node.last_scored = now()`. The staleness check in
`score_memories_incremental` (learn.rs:325) then skips recently-scored
nodes on subsequent runs.
## Files
- `src/mind/mod.rs:341-352` — scoring completion handler (add Store write)
- `src/hippocampus/store/ops.rs:304-313``score_weight()` (exists, unused)
- `src/subconscious/learn.rs:322-326` — staleness check (already correct)
- `src/hippocampus/store/types.rs:219``Node.last_scored` field

doc/ui-desync-analysis.md
# UI Desync Analysis — Pending Input + Entry Pop (2026-04-07)
## Context
The F1 conversation pane has a desync bug where entries aren't
properly removed when they change (streaming updates, compaction).
Qwen's fix restored the pending_display_count approach for pending
input, which works. The remaining issue is the **entry-level pop**.
## The Bug: Pop/Push Line Count Mismatch
In `sync_from_agent()` (chat.rs), Phase 1 pops changed entries and
Phase 2 pushes new ones. The push and pop paths produce different
numbers of display lines for the same entry.
### Push path (Phase 2, lines 512-536):
- **Conversation/ConversationAssistant**: `append_text(&text)` +
`flush_pending()`. In markdown mode, `flush_pending` runs
`parse_markdown()` which can produce N lines from the input text
(paragraph breaks, code blocks, etc.)
- **Tools**: `push_line(text, Color::Yellow)` — exactly 1 line.
- **ToolResult**: `text.lines().take(20)` — up to 20 lines, each
pushed separately.
### Pop path (Phase 1, lines 497-507):
```rust
for (target, _, _) in Self::route_entry(&popped) {
    match target {
        PaneTarget::Conversation | PaneTarget::ConversationAssistant
            => self.conversation.pop_line(),
        PaneTarget::Tools | PaneTarget::ToolResult
            => self.tools.pop_line(),
    }
}
```
This pops **one line per route_entry item**, not per display line.
### The mismatch:
| Target | Push lines | Pop lines | Delta |
|---------------------|-----------|-----------|----------|
| Conversation (md) | N (from parse_markdown) | 1 | N-1 stale lines |
| Tools | 1 | 1 | OK |
| ToolResult | up to 20 | 1 | up to 19 stale lines |
## When it matters
During **streaming**: the last assistant entry is modified on each
token batch. `sync_from_agent` detects the mismatch (line 485),
pops the old entry (1 line), pushes the new entry (N lines from
markdown). Next update: pops 1 line again, but there are now N
lines from the previous push. Stale lines accumulate.
## Fix approach
Track the actual number of display lines each entry produced.
Simplest: snapshot `conversation.lines.len()` before and after
pushing each entry in Phase 2. Store the deltas in a parallel
`Vec<(usize, usize)>` (conversation_lines, tools_lines) alongside
`last_entries`. Use these recorded counts when popping in Phase 1.
```rust
// Phase 2: push new entries (modified)
let conv_before = self.conversation.lines.len();
let tools_before = self.tools.lines.len();
for (target, text, marker) in Self::route_entry(entry) {
    // ... existing push logic ...
}
let conv_delta = self.conversation.lines.len() - conv_before;
let tools_delta = self.tools.lines.len() - tools_before;
self.last_entry_line_counts.push((conv_delta, tools_delta));

// Phase 1: pop (modified)
while self.last_entries.len() > pop {
    self.last_entries.pop();
    let (conv_lines, tools_lines) = self.last_entry_line_counts.pop().unwrap();
    for _ in 0..conv_lines { self.conversation.pop_line(); }
    for _ in 0..tools_lines { self.tools.pop_line(); }
}
```
## Note on PaneState::evict()
`evict()` can remove old lines from the beginning when the pane
exceeds `MAX_PANE_LINES` (10,000). This could make the delta-based
approach slightly inaccurate for very old entries. But we only pop
recent entries (streaming updates are always at the tail), so
eviction doesn't affect the entries we're popping.
## Files
- `src/user/chat.rs:461-550` — sync_from_agent
- `src/user/chat.rs:282-298` — PaneState::append_text (markdown path)
- `src/user/chat.rs:261-276` — PaneState::flush_pending
- `src/user/chat.rs:206-219` — parse_markdown