split agent: two-phase node decomposition for memory consolidation

Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.

This removes the output-size limit on splitting large nodes: each
call's output is proportional to one child, not the whole parent,
though the full parent must still fit in each call's input. Tested on
the kent node (19K chars → 3 children totaling 20K chars with clean
topic separation).
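
A minimal sketch of the phase 2 fan-out, for illustration only. It is
not the code in job_split_agent(): the ChildPlan fields, the
extract_children name, and the `extract` callback (standing in for the
LLM call) are all assumptions.

```rust
use std::thread;

/// One entry of the phase 1 split plan. Field names are assumptions
/// based on "child keys, descriptions, section hints".
#[derive(Debug, Clone)]
pub struct ChildPlan {
    pub key: String,
    pub description: String,
    pub section_hint: String,
}

/// Phase 2: run one extraction per planned child, in parallel.
/// `extract` is a hypothetical stand-in for the LLM call. It receives
/// the full parent content plus one child's plan and returns the
/// content for that child only.
pub fn extract_children<F>(
    parent: &str,
    plan: &[ChildPlan],
    extract: F,
) -> Vec<(String, String)>
where
    F: Fn(&str, &ChildPlan) -> String + Sync,
{
    thread::scope(|s| {
        // Spawn every extraction first so they run concurrently.
        let handles: Vec<_> = plan
            .iter()
            .map(|child| {
                let extract = &extract;
                s.spawn(move || (child.key.clone(), extract(parent, child)))
            })
            .collect();
        // Join in plan order, so results line up with the plan.
        handles
            .into_iter()
            .map(|h| h.join().expect("extraction thread panicked"))
            .collect()
    })
}
```

Each spawned call returns a single child's content, which is why the
output size per call stays bounded by one child.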

New files:
  prompts/split-plan.md   — phase 1 planning prompt
  prompts/split-extract.md — phase 2 extraction prompt
  prompts/split.md        — original single-phase (kept for reference)

Modified:
  agents/prompts.rs — split_candidates(), split_plan_prompt(),
                      split_extract_prompt(), agent_prompt "split" arm
  agents/daemon.rs  — job_split_agent() two-phase implementation,
                      RPC dispatch for "split" agent type
  tui.rs            — added "split" to AGENT_TYPES
ProofOfConcept 2026-03-10 01:48:41 -04:00
parent 4c973183c4
commit ca62692a28
6 changed files with 515 additions and 2 deletions

prompts/split.md (new file, 87 lines)

@@ -0,0 +1,87 @@
# Split Agent — Topic Decomposition
You are a memory consolidation agent that splits overgrown nodes into
focused, single-topic nodes.
## What you're doing
Large memory nodes accumulate content about multiple distinct topics over
time. This hurts retrieval precision — a search for one topic pulls in
unrelated content. Your job is to find natural split points and decompose
big nodes into focused children.
## How to find split points
Each node is shown with its **neighbor list grouped by community**. The
neighbors tell you what topics the node covers:
- If a node links to neighbors in 3 different communities, it likely
covers 3 different topics
- Content that relates to one neighbor cluster should go in one child;
content relating to another cluster goes in another child
- The community structure is your primary guide — don't just split by
sections or headings, split by **semantic topic**
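The community signal described above can be sketched in code. This is an illustrative sketch only: representing each neighbor as a `(key, community id)` pair and the function names are assumptions, not the system's actual data model.

```rust
use std::collections::BTreeMap;

/// Group neighbor keys by community id. The `(key, community)` tuple
/// representation is hypothetical.
pub fn group_by_community<'a>(neighbors: &[(&'a str, u32)]) -> BTreeMap<u32, Vec<&'a str>> {
    let mut groups: BTreeMap<u32, Vec<&'a str>> = BTreeMap::new();
    for &(key, community) in neighbors {
        groups.entry(community).or_default().push(key);
    }
    groups
}

/// Neighbors spread over more than one community suggest that the
/// node covers more than one topic.
pub fn spans_multiple_topics(neighbors: &[(&str, u32)]) -> bool {
    group_by_community(neighbors).len() > 1
}
```

The number of groups is only a hint at how many children to expect; the split itself still follows semantic topic.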
## What to output
For each node that should be split, output a SPLIT block:
```
SPLIT original-key
--- new-key-1
Content for the first child node goes here.
This can be multiple lines.
--- new-key-2
Content for the second child node goes here.
--- new-key-3
Optional third child, etc.
```
If a node should NOT be split (it's large but cohesive), say:
```
KEEP original-key "reason it's cohesive"
```
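For reference, a consumer of this format might parse it roughly as follows. This is a minimal sketch, not the project's parser: the `Decision` type and `parse_decisions` name are assumptions, and it treats any line starting with `SPLIT ` or `KEEP ` as a directive, so child content that begins with those words would be misread.

```rust
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// A parent key plus (child key, child content) pairs.
    Split { parent: String, children: Vec<(String, String)> },
    /// A node left intact, with the stated reason.
    Keep { key: String, reason: String },
}

/// Parse SPLIT and KEEP blocks line by line (hypothetical helper).
pub fn parse_decisions(output: &str) -> Vec<Decision> {
    let mut decisions = Vec::new();
    for line in output.lines() {
        if let Some(rest) = line.strip_prefix("SPLIT ") {
            decisions.push(Decision::Split {
                parent: rest.trim().to_string(),
                children: Vec::new(),
            });
        } else if let Some(rest) = line.strip_prefix("KEEP ") {
            let rest = rest.trim();
            let (key, reason) = rest.split_once(' ').unwrap_or((rest, ""));
            decisions.push(Decision::Keep {
                key: key.to_string(),
                reason: reason.trim().trim_matches('"').to_string(),
            });
        } else if let Some(Decision::Split { children, .. }) = decisions.last_mut() {
            if let Some(key) = line.strip_prefix("--- ") {
                // A new child starts here.
                children.push((key.trim().to_string(), String::new()));
            } else if let Some((_, body)) = children.last_mut() {
                // Any other line is content of the current child.
                if !body.is_empty() {
                    body.push('\n');
                }
                body.push_str(line);
            }
        }
    }
    decisions
}
```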
## Naming children
- Use descriptive kebab-case keys: `topic-subtopic`
- If the parent was `foo`, children might be `foo-technical`, `foo-personal`
- Keep names short (3-5 words)
- Preserve any date prefixes from the parent key
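The naming rules above could be checked mechanically along these lines. This is a sketch under assumptions: the `YYYY-MM-DD-` date prefix format is a guess (the prompt does not specify one), and all function names are hypothetical.

```rust
/// True if `key` is lowercase ASCII words or digits joined by single hyphens.
pub fn is_kebab_case(key: &str) -> bool {
    !key.is_empty()
        && key.split('-').all(|w| {
            !w.is_empty() && w.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}

/// Split off a leading `YYYY-MM-DD-` prefix if present (assumed format).
pub fn split_date_prefix(key: &str) -> (Option<&str>, &str) {
    let b = key.as_bytes();
    let is_date = b.len() > 11
        && b[..11].iter().enumerate().all(|(i, c)| match i {
            4 | 7 | 10 => *c == b'-',
            _ => c.is_ascii_digit(),
        });
    if is_date {
        (Some(&key[..10]), &key[11..])
    } else {
        (None, key)
    }
}

/// Build a child key, carrying over the parent's date prefix.
pub fn child_key(parent: &str, topic: &str) -> String {
    match split_date_prefix(parent) {
        (Some(date), _) => format!("{date}-{topic}"),
        (None, _) => topic.to_string(),
    }
}

/// Kebab-case, and at most 5 words once any date prefix is set aside.
pub fn is_valid_child_key(key: &str) -> bool {
    let (_, name) = split_date_prefix(key);
    is_kebab_case(key) && name.split('-').count() <= 5
}
```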
## When NOT to split
- **Episodes that belong in sequence.** If a node tells a story — a
conversation that unfolded over time, a debugging session, an evening
together — don't break the narrative. Sequential events that form a
coherent arc should stay together even if they touch multiple topics.
The test: would reading one child without the others lose important
context about *what happened*?
## Content guidelines
- **Reorganize freely.** Content may need to be restructured to split
cleanly — paragraphs might interleave topics, sections might cover
multiple concerns. Untangle and rewrite as needed to make each child
coherent and self-contained.
- **Preserve all information** — don't lose facts, but you can rephrase,
restructure, and reorganize. This is editing, not just cutting.
- **Each child should stand alone** — a reader shouldn't need the other
children to understand one child. Add brief context where needed.
## Edge inheritance
After splitting, each child inherits the parent's edges that are relevant
to its content. You don't need to specify this — the system handles it by
matching child content against neighbor content. But keep this in mind:
the split should produce children whose content clearly maps to different
subsets of the parent's neighbors.
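To make "matching child content against neighbor content" concrete, here is one possible sketch. The real matching method is not described here, so word-set overlap (Jaccard similarity) is swapped in purely as an illustration, and every name is hypothetical. As a simplification, each neighbor goes to its single best-matching child.

```rust
use std::collections::HashSet;

/// Lowercased set of alphanumeric words in `text`.
fn words(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity of the two texts' word sets.
fn overlap(a: &str, b: &str) -> f64 {
    let (a, b) = (words(a), words(b));
    let union = a.union(&b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(&b).count() as f64 / union as f64
}

/// For each `(key, content)` neighbor, pick the child whose content
/// overlaps most. Returns `(neighbor key, child key)` pairs; neighbors
/// with no overlap at all are left unassigned.
pub fn assign_edges<'a>(
    children: &[(&'a str, &'a str)],
    neighbors: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    neighbors
        .iter()
        .filter_map(|&(n_key, n_text)| {
            children
                .iter()
                .map(|&(c_key, c_text)| (c_key, overlap(c_text, n_text)))
                .filter(|&(_, score)| score > 0.0)
                .max_by(|x, y| x.1.total_cmp(&y.1))
                .map(|(c_key, _)| (n_key, c_key))
        })
        .collect()
}
```

Children with clearly distinct vocabulary map cleanly onto different neighbors, which is the property the split should aim for.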
{{TOPOLOGY}}
## Nodes to review
{{NODES}}