split agent: two-phase node decomposition for memory consolidation

Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.

This handles arbitrarily large nodes because output is always
proportional to one child, not the whole parent. Tested on the kent
node (19K chars → 3 children totaling 20K chars with clean topic
separation).

New files:
  prompts/split-plan.md — phase 1 planning prompt
  prompts/split-extract.md — phase 2 extraction prompt
  prompts/split.md — original single-phase (kept for reference)

Modified:
  agents/prompts.rs — split_candidates(), split_plan_prompt(),
                      split_extract_prompt(), agent_prompt "split" arm
  agents/daemon.rs — job_split_agent() two-phase implementation,
                     RPC dispatch for "split" agent type
  tui.rs — added "split" to AGENT_TYPES
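The phase 2 fan-out described in the commit message could look roughly like the sketch below. This is not the `job_split_agent()` from agents/daemon.rs, which this page does not show. `ChildPlan`, `extract_prompt`, `run_extractions`, and the `llm` closure are hypothetical names, and the phase 1 call and JSON parsing are omitted; the sketch assumes the plan has already been parsed into `ChildPlan` values.

```rust
use std::thread;

/// One child from the phase 1 plan (hypothetical type; the fields mirror
/// the "key", "description" and "sections" entries of the JSON plan).
#[derive(Debug, Clone)]
struct ChildPlan {
    key: String,
    description: String,
    sections: Vec<String>,
}

/// Stand-in for filling prompts/split-extract.md: every child prompt carries
/// the child's metadata plus the FULL parent content.
fn extract_prompt(child: &ChildPlan, parent_content: &str) -> String {
    format!(
        "Key: {}\nDescription: {}\nSection hints: {}\n\n{}",
        child.key,
        child.description,
        child.sections.join(", "),
        parent_content
    )
}

/// Phase 2: one extraction call per child, run in parallel on scoped threads.
/// `llm` is an assumed callback that sends a prompt and returns the reply.
/// Results come back as (child key, extracted content) in plan order.
fn run_extractions<F>(
    children: &[ChildPlan],
    parent_content: &str,
    llm: F,
) -> Vec<(String, String)>
where
    F: Fn(&str) -> String + Sync,
{
    let llm = &llm;
    thread::scope(|s| {
        // Spawn every call first so they overlap, then join in order.
        let handles: Vec<_> = children
            .iter()
            .map(|child| {
                s.spawn(move || {
                    let prompt = extract_prompt(child, parent_content);
                    (child.key.clone(), llm(&prompt))
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("extraction thread panicked"))
            .collect()
    })
}
```

Each call's output is bounded by one child's content, which is the property the commit message relies on. The input to every call still contains the whole parent.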
parent 4c973183c4
commit ca62692a28
6 changed files with 515 additions and 2 deletions

prompts/split-extract.md (normal file, 33 lines added)
@@ -0,0 +1,33 @@
# Split Agent — Phase 2: Extract

You are extracting content for one child node from a parent that is
being split into multiple focused nodes.

## Your task

Extract all content from the parent node that belongs to the child
described below. Output ONLY the content for this child — nothing else.

## Guidelines

- **Reorganize freely.** Content may need to be restructured — paragraphs
  might interleave topics, sections might cover multiple concerns.
  Untangle and rewrite as needed to make this child coherent and
  self-contained.
- **Preserve all relevant information** — don't lose facts, but you can
  rephrase, restructure, and reorganize. This is editing, not just cutting.
- **This child should stand alone** — a reader shouldn't need the other
  children to understand it. Add brief context where needed.
- **Include everything that belongs here** — better to include a borderline
  paragraph than to lose information. The other children will get their
  own extraction passes.

## Child to extract

Key: {{CHILD_KEY}}
Description: {{CHILD_DESC}}
Section hints: {{CHILD_SECTIONS}}

## Parent content

{{PARENT_CONTENT}}
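The `{{CHILD_KEY}}`, `{{CHILD_DESC}}`, `{{CHILD_SECTIONS}}` and `{{PARENT_CONTENT}}` placeholders in prompts/split-extract.md have to be filled before the prompt is sent. The real `split_extract_prompt()` in agents/prompts.rs is not shown on this page, so the following is only a sketch of one way to do it; `fill_template` is a hypothetical helper. It scans the template once, so placeholder-like text inside the parent content is never substituted a second time.

```rust
/// Replace each `{{NAME}}` in `template` with the matching value from `vars`.
/// Substituted values are copied verbatim and never rescanned.
/// Returns Err(name) for a placeholder that has no entry in `vars`.
fn fill_template(template: &str, vars: &[(&str, &str)]) -> Result<String, String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = match after.find("}}") {
            Some(end) => end,
            None => {
                // Unterminated "{{": keep the remainder as literal text.
                out.push_str(&rest[start..]);
                return Ok(out);
            }
        };
        let name = &after[..end];
        match vars.iter().find(|(k, _)| *k == name) {
            Some((_, value)) => out.push_str(value),
            None => return Err(name.to_string()),
        }
        rest = &after[end + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Failing on an unknown placeholder, rather than leaving it in place, is a design choice of this sketch: it keeps a half-filled prompt from reaching the model unnoticed.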

prompts/split-plan.md (normal file, 86 lines added)
@@ -0,0 +1,86 @@
# Split Agent — Phase 1: Plan

You are a memory consolidation agent planning how to split an overgrown
node into focused, single-topic children.

## What you're doing

This node has grown to cover multiple distinct topics. Your job is to
identify the natural topic boundaries and propose a split plan. You are
NOT writing the content — a second phase will extract each child's
content separately.

## How to find split points

The node is shown with its **neighbor list grouped by community**. The
neighbors tell you what topics the node covers:

- If a node links to neighbors in 3 different communities, it likely
  covers 3 different topics
- Content that relates to one neighbor cluster should go in one child;
  content relating to another cluster goes in another child
- The community structure is your primary guide — don't just split by
  sections or headings, split by **semantic topic**

## When NOT to split

- **Episodes that belong in sequence.** If a node tells a story — a
  conversation that unfolded over time, a debugging session, an evening
  together — don't break the narrative. Sequential events that form a
  coherent arc should stay together even if they touch multiple topics.
  The test: would reading one child without the others lose important
  context about *what happened*?

## What to output

Output a JSON block describing the split plan:

```json
{
  "action": "split",
  "parent": "original-key",
  "children": [
    {
      "key": "new-key-1",
      "description": "Brief description of what this child covers",
      "sections": ["Section Header 1", "Section Header 2"]
    },
    {
      "key": "new-key-2",
      "description": "Brief description of what this child covers",
      "sections": ["Section Header 3", "Another Section"]
    }
  ]
}
```

If the node should NOT be split:

```json
{
  "action": "keep",
  "parent": "original-key",
  "reason": "Why this node is cohesive despite its size"
}
```

## Naming children

- Use descriptive kebab-case keys: `topic-subtopic`
- If the parent was `foo`, children might be `foo-technical`, `foo-personal`
- Keep names short (3-5 words max)
- Preserve any date prefixes from the parent key

## Section hints

The "sections" field is a guide for the extraction phase — list the
section headers or topic areas from the original content that belong
in each child. These don't need to be exact matches; they're hints
that help the extractor know what to include. Content that spans topics
or doesn't have a clear header can be mentioned in the description.

{{TOPOLOGY}}

## Node to review

{{NODE}}
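The "Naming children" rules in prompts/split-plan.md are instructions to the model, so a caller may want to check the returned keys before creating nodes. Nothing on this page shows such a check, so the sketch below is purely illustrative. `is_kebab_case`, `date_prefix` and `check_child_key` are hypothetical helpers. Two things are assumptions: that a "date prefix" means a leading `YYYY-MM-DD`, and that kebab-case means lowercase ASCII letters and digits joined by single hyphens.

```rust
/// True if `key` consists of non-empty segments of lowercase ASCII letters
/// or digits, joined by single hyphens (assumed meaning of "kebab-case").
fn is_kebab_case(key: &str) -> bool {
    !key.is_empty()
        && key.split('-').all(|seg| {
            !seg.is_empty()
                && seg.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}

/// Leading `YYYY-MM-DD` of a key, if present (assumed date-prefix format).
fn date_prefix(key: &str) -> Option<&str> {
    let prefix = key.get(..10)?;
    let ok = prefix.bytes().enumerate().all(|(i, b)| match i {
        4 | 7 => b == b'-',
        _ => b.is_ascii_digit(),
    });
    if ok { Some(prefix) } else { None }
}

/// Check one proposed child key against the naming rules of the plan prompt.
fn check_child_key(parent: &str, child: &str) -> Result<(), String> {
    if !is_kebab_case(child) {
        return Err(format!("{child}: not kebab-case"));
    }
    // Rule: preserve any date prefix from the parent key.
    if let Some(date) = date_prefix(parent) {
        if !child.starts_with(date) {
            return Err(format!("{child}: missing date prefix {date}"));
        }
    }
    // Rule: keep names short. Words are counted after the date prefix,
    // and 5 is taken as the upper bound of "3-5 words max".
    let body = match date_prefix(child) {
        Some(date) => child[date.len()..].trim_start_matches('-'),
        None => child,
    };
    let words = body.split('-').filter(|w| !w.is_empty()).count();
    if words > 5 {
        return Err(format!("{child}: {words} words, expected at most 5"));
    }
    Ok(())
}
```

A rejected key could either fail the job or be sent back to the model for another attempt; the source does not say which, if either, the daemon does.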

prompts/split.md (normal file, 87 lines added)
@@ -0,0 +1,87 @@
# Split Agent — Topic Decomposition

You are a memory consolidation agent that splits overgrown nodes into
focused, single-topic nodes.

## What you're doing

Large memory nodes accumulate content about multiple distinct topics over
time. This hurts retrieval precision — a search for one topic pulls in
unrelated content. Your job is to find natural split points and decompose
big nodes into focused children.

## How to find split points

Each node is shown with its **neighbor list grouped by community**. The
neighbors tell you what topics the node covers:

- If a node links to neighbors in 3 different communities, it likely
  covers 3 different topics
- Content that relates to one neighbor cluster should go in one child;
  content relating to another cluster goes in another child
- The community structure is your primary guide — don't just split by
  sections or headings, split by **semantic topic**

## What to output

For each node that should be split, output a SPLIT block:

```
SPLIT original-key
--- new-key-1
Content for the first child node goes here.
This can be multiple lines.

--- new-key-2
Content for the second child node goes here.

--- new-key-3
Optional third child, etc.
```

If a node should NOT be split (it's large but cohesive), say:

```
KEEP original-key "reason it's cohesive"
```

## Naming children

- Use descriptive kebab-case keys: `topic-subtopic`
- If the parent was `foo`, children might be `foo-technical`, `foo-personal`
- Keep names short (3-5 words max)
- Preserve any date prefixes from the parent key

## When NOT to split

- **Episodes that belong in sequence.** If a node tells a story — a
  conversation that unfolded over time, a debugging session, an evening
  together — don't break the narrative. Sequential events that form a
  coherent arc should stay together even if they touch multiple topics.
  The test: would reading one child without the others lose important
  context about *what happened*?

## Content guidelines

- **Reorganize freely.** Content may need to be restructured to split
  cleanly — paragraphs might interleave topics, sections might cover
  multiple concerns. Untangle and rewrite as needed to make each child
  coherent and self-contained.
- **Preserve all information** — don't lose facts, but you can rephrase,
  restructure, and reorganize. This is editing, not just cutting.
- **Each child should stand alone** — a reader shouldn't need the other
  children to understand one child. Add brief context where needed.

## Edge inheritance

After splitting, each child inherits the parent's edges that are relevant
to its content. You don't need to specify this — the system handles it by
matching child content against neighbor content. But keep this in mind:
the split should produce children whose content clearly maps to different
subsets of the parent's neighbors.

{{TOPOLOGY}}

## Nodes to review

{{NODES}}
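For comparison with the JSON plan of the two-phase version, the SPLIT/KEEP text format requested by prompts/split.md can be read with a small line-based parser. The sketch below is an assumption about how such output might be consumed, not code from this commit; `Decision` and `parse_decisions` are hypothetical names. It also shows the format's fragility: child content that itself contains a line starting with `SPLIT `, `KEEP ` or `--- ` would be misread as a marker.

```rust
/// One decision from single-phase output (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    /// `children` holds (child key, child content) pairs in output order.
    Split { parent: String, children: Vec<(String, String)> },
    Keep { parent: String, reason: String },
}

/// Parse `SPLIT key` / `--- child-key` / `KEEP key "reason"` lines.
/// Lines before the first marker, and lines after a KEEP, are ignored.
fn parse_decisions(output: &str) -> Vec<Decision> {
    let mut decisions = Vec::new();
    for line in output.lines() {
        if let Some(rest) = line.strip_prefix("SPLIT ") {
            decisions.push(Decision::Split {
                parent: rest.trim().to_string(),
                children: Vec::new(),
            });
        } else if let Some(rest) = line.strip_prefix("KEEP ") {
            let rest = rest.trim();
            let (parent, reason) = rest.split_once(' ').unwrap_or((rest, ""));
            decisions.push(Decision::Keep {
                parent: parent.to_string(),
                reason: reason.trim().trim_matches('"').to_string(),
            });
        } else if let Some(Decision::Split { children, .. }) = decisions.last_mut() {
            if let Some(key) = line.strip_prefix("--- ") {
                // Start of a new child inside the current SPLIT block.
                children.push((key.trim().to_string(), String::new()));
            } else if let Some((_, content)) = children.last_mut() {
                // Any other line belongs to the most recent child.
                content.push_str(line);
                content.push('\n');
            }
        }
    }
    // Drop the blank lines that separate children in the output.
    for decision in &mut decisions {
        if let Decision::Split { children, .. } = decision {
            for (_, content) in children.iter_mut() {
                *content = content.trim().to_string();
            }
        }
    }
    decisions
}
```

In this format the whole rewritten content of every child arrives in a single response, so output length grows with the parent. That is the limit the two-phase design in this commit is meant to remove.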